Information transmission through a noisy quantum channel 
Noisy quantum channels may be used in many information carrying applications. We show that 
different applications may result in different channel capacities. Upper bounds on several of these 
capacities are proved. These bounds are based on the coherent information, which plays a role in 
quantum information theory analogous to that played by the mutual information in classical in- 
formation theory. Many new properties of the coherent information and entanglement fidelity are 
proved. Two non-classical features of the coherent information are demonstrated: the failure of 
subadditivity, and the failure of the pipelining inequality. Both properties arise as a consequence 
of quantum entanglement, and give quantum information new features not found in classical in- 
formation theory. The problem of a noisy quantum channel with a classical observer measuring 
the environment is introduced, and bounds on the corresponding channel capacity proved. These 
bounds are always greater than for the unobserved channel. We conclude with a summary of open 
problems. 
I. INTRODUCTION 

A central result of Shannon's classical theory of infor- 
mation |1^-01 is the noisy channel coding theorem. This 
result provides an effective procedure for determining the 
capacity of a noisy channel - the maximum rate at which 
classical information can be reliably transmitted through 
the channel. There has been much recent work on quan- 
tum analogues of this result ^^ ■ 

This paper has two central purposes. The first purpose 
is to develop general techniques for proving upper bounds 
on the capacity of a noisy quantum channel, which are ap- 
plied to several different classes of quantum noisy channel 
problems. Second, we point out some essentially new fea- 
tures that quantum mechanics introduces into the noisy 
channel problem. 

The paper is organized as follows. In section II we 
give a basic introduction to the problem of the noisy 
quantum channel, and explain the key concepts. Section 
III reviews the quantum operations formalism that 
is used throughout the paper to describe a noisy quantum 
channel, and section IV reviews the concept of the 
entropy exchange associated with a quantum operation. 
Section V shows how the classical noisy channel coding 
theorem can be put into the quantum language, and explains 
why the capacities that arise in this context are 
not useful for applications such as quantum computing 
and teleportation. Section VI discusses the entanglement 
fidelity, which is the measure we use to quantify how well 
a state and its entanglement are transmitted through a 
noisy quantum channel. Section VII discusses the co- 
herent information introduced in |^] as an analogue to 
the concept of mutual information in classical informa- 
tion theory. Many new results about the coherent infor- 
mation are proved, and we show that quantum entan- 
glement allows the coherent information to have proper- 
ties which have no classical analogue. These properties 
are critical to understanding what is essentially quantum 
about the quantum noisy channel coding problem. Section 
VIII brings us back to noisy channel coding, and 
formally sets up the class of noisy channel coding problems 
we consider. Section IX proves a variety of upper 
bounds on the capacity of a noisy quantum channel, depending 
on what class of coding schemes one is willing 
to allow. This is followed in section X by a discussion 
of the achievability of these upper bounds and of earlier 
work on channel capacity. Section XI formulates the new 
problem of a noisy quantum channel with measurement, 
allowing classical information about the environment to 
be obtained by measurement, and then used during the 
decoding process. Upper bounds on the corresponding 
channel capacity are proved. Finally, section XII concludes 
with a summary of our results, a discussion of 
the new features which quantum mechanics adds to the 
problem of the noisy channel, and suggestions for further 
research. 



II. NOISY CHANNEL CODING 

The problem of noisy channel coding will be outlined in 
this section. Precise definitions of the concepts used will 
be given in later sections. The procedure is illustrated in 
figure 1. 

FIG. 1. The noisy quantum channel, together with encodings 
and decodings. 

There is a quantum source emitting unknown quantum 
states, which we wish to transmit through the channel to 
some receiver. Unfortunately, the channel is usually sub- 
ject to noise, which prevents it from transmitting states 
with high fidelity. For example, an optical fiber suffers 
losses during transmission. Another important example 
of a noisy quantum channel is the memory of a quan- 
tum computer. There the idea is to transmit quantum 
states in time. The effect of transmitting a state from 
time $t_1$ to $t_2$ can be described as a noisy quantum channel. 
Quantum teleportation can also be described as a 
noisy quantum channel whenever there are imperfections 
in the teleportation process [p|,fo| . 

The idea of noisy channel coding is to encode the quantum 
state emitted by the source, $\rho_s$, which one wishes to 
transmit, using some encoding operation, which we denote 
$\mathcal{C}$. The encoded state is then sent through the channel, 
whose operation we denote by $\mathcal{N}$. The output state 
of the channel is then decoded using some decoding operation, 
$\mathcal{D}$. The objective is for the decoded state to match 
with high fidelity the state emitted by the source. As 
in the classical theory, we consider the fidelity of large 
blocks of material produced by repeated emission from 
the source, and allow the encoding and decoding to oper- 
ate on these blocks. A channel is said to transmit a source 
reliably if a sequence of block-coding and block-decoding 
procedures can be found that approaches perfect fidelity 
in the limit of large block size. 

What then is the capacity of such a channel - the high- 
est rate at which information can be reliably transmitted 
through the channel? The goal of a channel capacity the- 
orem is to provide a procedure to answer this question. 
This procedure must be an effective procedure, that is, an 
explicit algorithm to evaluate the channel capacity. Such 
a theorem comes in two parts. One part proves an upper 



bound on the rate at which information can be reliably 
transmitted through the channel. The other part demon- 
strates that there are coding and decoding schemes which 
attain this bound, which is therefore the channel capac- 
ity. We do not prove such a channel capacity theorem in 
this paper. We do, however, derive bounds on the rate at 
which information can be sent through a noisy quantum 
channel. 



III. QUANTUM OPERATIONS 

What is a quantum noisy channel, and how can it be 
described mathematically? This section reviews the for- 
malism of quantum operations, which is used to describe 
noisy channels. Previous papers on the noisy channel 
problem [^-§1 have used apparently different formalisms 
to describe the noisy channel. In fact, all the formalisms 
can be shown to be equivalent, as we shall see in this sec- 
tion. Historically, quantum operations have also some- 
times been known as completely positive maps or super- 
scattering operators. The motivation in all cases has been 
to describe general state changes in quantum mechanics. 

A simple example of a state change in quantum me- 
chanics is the unitary evolution experienced by a closed 
quantum system. The final state of the system is related 
to the initial state by a unitary transformation $U$, 

$$\rho \to \mathcal{E}(\rho) = U \rho U^\dagger. \tag{3.1}$$



Although all closed quantum systems are described by 
unitary evolutions, in accordance with Schrodinger's 
equation, more general state changes are possible for 
open quantum systems, such as noisy quantum channels. 
How does one describe a general state change in quan- 
tum mechanics? The answer to this question is provided 
by the quantum operations formalism. This formalism is 
described in detail by Kraus [11] (see also Hellwig and 
Kraus [12]) and is given short but detailed reviews in 
Choi |13| and in the Appendix to M. In this formalism 
there is an input state and an output state, which are 
connected by a map 



$$\rho \to \frac{\mathcal{E}(\rho)}{\mathrm{tr}(\mathcal{E}(\rho))}. \tag{3.2}$$

This map is a quantum operation, $\mathcal{E}$, a linear, trace-decreasing 
map that preserves positivity. The trace in 
the denominator is included in order to preserve the trace 
condition, $\mathrm{tr}(\rho) = 1$. 

The most general form for £ that is physically reason- 
able (in addition to being linear and trace-decreasing and 
preserving positivity, a physically reasonable £ must sat- 
isfy an additional property called complete positivity), 
can be shown to be [^ 



$$\mathcal{E}(\rho) = \sum_i A_i \rho A_i^\dagger. \tag{3.3}$$

The system operators $A_i$, which must satisfy 
$\sum_i A_i^\dagger A_i \le I$, completely specify the quantum operation. 
In the particular case of a unitary transformation, 
there is only one term in the sum, $A_1 = U$, leaving us 
with the transformation (3.1). 

A class of operations that is of particular interest are 
the trace-preserving or non-selective operations. Physi- 
cally, these arise in situations where the system is coupled 
to some environment which is not under observation; the 
effect of the evolution is averaged over all possible out- 
comes of the interaction with the environment. 
Trace-preserving operations are defined by the requirement that 

$$\sum_i A_i^\dagger A_i = I. \tag{3.4}$$

This is equivalent to requiring that for all density operators 
$\rho$, 

$$\mathrm{tr}(\mathcal{E}(\rho)) = 1, \tag{3.5}$$

explaining the nomenclature "trace-preserving". Notice 
that this means the evolution equation (3.2) reduces to 
the simpler form 

$$\rho \to \mathcal{E}(\rho), \tag{3.6}$$

when $\mathcal{E}$ is trace-preserving. 
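
The operator-sum form (3.3) and the trace-preserving condition (3.4) are easy to check numerically. The following is a minimal sketch, not part of the original argument: the amplitude-damping operation elements, the value of `gamma`, and the function names are illustrative assumptions.

```python
import numpy as np

def apply_operation(kraus_ops, rho):
    """Unnormalized output sum_i A_i rho A_i^dagger, as in Eq. (3.3)."""
    return sum(A @ rho @ A.conj().T for A in kraus_ops)

def completeness(kraus_ops):
    """sum_i A_i^dagger A_i, which equals I for a trace-preserving operation (3.4)."""
    return sum(A.conj().T @ A for A in kraus_ops)

# Assumed example: amplitude damping of a qubit with decay probability gamma.
gamma = 0.3
A0 = np.array([[1.0, 0.0], [0.0, np.sqrt(1.0 - gamma)]])
A1 = np.array([[0.0, np.sqrt(gamma)], [0.0, 0.0]])
kraus_ops = [A0, A1]

rho = np.array([[0.5, 0.5], [0.5, 0.5]])      # the pure state |+><+|
rho_out = apply_operation(kraus_ops, rho)

# For a trace-preserving operation the normalization in (3.2) is trivial.
is_trace_preserving = np.allclose(completeness(kraus_ops), np.eye(2))
print(is_trace_preserving, np.trace(rho_out).real)
```

Because the completeness sum equals the identity here, the output already has unit trace and (3.2) reduces to (3.6).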

The following representation theorem is proved in [[ll| , 
||l3| , and |j]. It shows the connection between trace- 
preserving quantum operations and systems interacting 
unitarily with an environment, and thus provides part of 
the justification for the physical interpretation of trace- 
preserving quantum operations described above. 

Theorem (representation theorem for trace-preserving 
quantum operations): 

Suppose $\mathcal{E}$ is a trace-preserving quantum operation on 
a system with a d-dimensional state space. Then it is 
possible to construct an "environment" E of at most $d^2$ 
dimensions, such that the system plus environment are 
initially uncorrelated, the environment is initially in a 
pure state $\sigma = |s\rangle\langle s|$, and there exists a unitary evolution 
$U$ on system and environment such that 

$$\mathcal{E}(\rho) = \mathrm{tr}_E\left( U (\rho \otimes \sigma) U^\dagger \right). \tag{3.7}$$

Here and elsewhere in the paper a subscript on a trace 
indicates a partial trace over the corresponding system 
(E in this case). 

Conversely, given any initially uncorrelated environment 
$\sigma$ (possibly of more than $d^2$ dimensions, and initially 
impure), a unitary interaction $U$ between the system 
and the environment gives rise to a trace-preserving 
quantum operation, 

$$\mathcal{E}(\rho) = \mathrm{tr}_E\left( U (\rho \otimes \sigma) U^\dagger \right). \tag{3.8}$$



This theorem tells us that any trace-preserving quan- 
tum operation can always be mocked up as a unitary evo- 
lution by adding an environment with which the system 



can interact unitarily. Conversely, it tells us that any 
such unitary interaction with an initially uncorrelated 
environment gives rise to a trace-preserving quantum op- 
eration. Both these facts are useful in what follows. The 
picture we have of a quantum operation is neatly sum- 
marized by the following diagram. 

FIG. 2. 

Here, Q denotes the state of the system before the interaction 
with the environment, and Q' the state of the 
system after the interaction. Unless stated otherwise we 
follow the convention that Q and Q' are d-dimensional. 
The environment system E and the operator $U^{QE}$ might 
be chosen to be the actual physical environment and its 
interaction with Q, but this is not necessary. The only 
thing that matters for the description of noisy channels 
is the dynamics of Q. For any given quantum operation 
$\mathcal{E}$ there are many possible representations of $\mathcal{E}$ in terms 
of environments E and interactions $U^{QE}$. We always 
assume that the initial state of E is a pure state, and regard 
E as a mathematical artifice. Of course, the actual 
physical environment, $E_a$, may be initially impure, but 
the above representation theorem shows that for the purposes 
of describing the dynamics of Q, it can be replaced 
by an "environment", E, which is initially pure and gives 
rise to exactly the same dynamics. In what follows it is 
this latter E that is most useful. 
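
The correspondence between the unitary picture (3.7) and the operator-sum picture (3.3) can be illustrated numerically. The sketch below is an illustration under stated assumptions: the interaction is a randomly generated unitary (a hypothetical choice), the environment starts in its first basis state, and the operation elements are taken to be the blocks $A_k = \langle k|U|s\rangle$ of that unitary.

```python
import numpy as np

d, d_env = 2, 4                      # system dimension d and environment dimension d^2
rng = np.random.default_rng(0)

# Hypothetical interaction: a random unitary on Q ⊗ E from a QR decomposition.
Z = rng.normal(size=(d * d_env, d * d_env)) + 1j * rng.normal(size=(d * d_env, d * d_env))
U, _ = np.linalg.qr(Z)

def channel_via_environment(rho):
    """tr_E( U (rho ⊗ |s><s|) U^dagger ), with |s> the first environment basis state."""
    sigma = np.zeros((d_env, d_env))
    sigma[0, 0] = 1.0
    joint = U @ np.kron(rho, sigma) @ U.conj().T
    joint = joint.reshape(d, d_env, d, d_env)     # indices (q, e, q', e')
    return np.einsum('aebe->ab', joint)           # partial trace over E

# Operation elements A_k = <k|_E U |s>_E, read off as blocks of U.
U_blocks = U.reshape(d, d_env, d, d_env)
kraus_ops = [U_blocks[:, k, :, 0] for k in range(d_env)]

def channel_via_kraus(rho):
    """sum_k A_k rho A_k^dagger, the operator-sum form (3.3)."""
    return sum(A @ rho @ A.conj().T for A in kraus_ops)

rho = np.array([[0.7, 0.2], [0.2, 0.3]])
print(np.allclose(channel_via_environment(rho), channel_via_kraus(rho)))
```

Both routes describe the same dynamics of Q, which is the sense in which the environment can be treated as a mathematical artifice.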

Shannon's classical noisy coding theorem is proved for 
discrete memoryless channels. Discrete means that the 
channel only has a finite number of input and output 
states. By analogy we define a discrete quantum channel 
to be one which has a finite number of Hilbert space di- 
mensions. In the classical case, memoryless means that 
the output of the channel is independent of the past, con- 
ditioned on knowing the state of the source. Quantum 
mechanically we take this to mean that the output of the 
channel is completely determined by the encoded state 
of the source, and is not affected by the previous history 
of the source. 

Phrased in the language of quantum operations, we assume 
that there is a quantum operation, $\mathcal{N}$, describing 
the dynamics of the channel. The input $\rho_i$ of the channel 
is related to the output $\rho_o$ by the equation 

$$\rho_i \to \rho_o = \mathcal{N}(\rho_i). \tag{3.9}$$



For the majority of this paper we assume, as in the previ- 
ous equation, that the operation describing the action of 
the channel is trace-preserving. This corresponds to the 
physical assumption that no classical information about 
the state of the system or its environment is obtained 
by an external classical observer. All previous work on 
noisy channel coding with the exception of [|4| has as- 
sumed that this is the case, and we do so for the major- 
ity of the paper. In section XI we consider the case of a 
noisy channel which is being observed by some classical 
observer. 

In addition to the environment, E, it is also extremely 
useful to introduce a reference system, R, in the following 
way. One might imagine that the system Q is initially 
part of a larger system, RQ, and that the total is in a 
pure state, $|\psi^{RQ}\rangle$, satisfying 

$$\rho^Q = \mathrm{tr}_R\left(|\psi^{RQ}\rangle\langle\psi^{RQ}|\right). \tag{3.10}$$

Such a state $|\psi^{RQ}\rangle$ is called a purification of $\rho^Q$, and it 
can be shown [15] that such a system R and purifications 
$|\psi^{RQ}\rangle$ always exist. From our point of view R is 
introduced simply as a mathematical device to purify the 
initial state. The joint system RQ evolves according to 
the dynamics $\mathcal{I}_R \otimes \mathcal{E}$ given by 

$$\rho^{R'Q'} = (\mathcal{I}_R \otimes \mathcal{E})(\rho^{RQ}). \tag{3.11}$$



The overall picture we have of a trace-preserving quan- 
tum operation thus looks like: 

FIG. 3. 



The picture we have described thus far applies only to 
trace-preserving quantum operations. Later in the paper 
we will also be interested in quantum operations which 
are not trace-preserving. That is, they do not satisfy the 
relation $\sum_i A_i^\dagger A_i = I$, and thus $\mathrm{tr}(\mathcal{E}(\rho)) \neq 1$ in general. 
Such quantum operations arise in the theory of generalized 
measurements. To each outcome, $m$, of a measurement 
there is an associated quantum operation, $\mathcal{E}_m$, with 
an operator-sum representation, 

$$\mathcal{E}_m(\rho) = \sum_i A_{mi} \rho A_{mi}^\dagger. \tag{3.12}$$

The probability of obtaining outcome $m$ is postulated to 
be 

$$\Pr(m) = \mathrm{tr}(\mathcal{E}_m(\rho)) = \mathrm{tr}\left(\sum_i A_{mi}^\dagger A_{mi} \rho\right). \tag{3.13}$$

The completeness relation for probabilities 
$\sum_m \Pr(m) = 1$ is equivalent to the completeness relation 
for the operators appearing in the operator-sum 
representations, 

$$\sum_{mi} A_{mi}^\dagger A_{mi} = I. \tag{3.14}$$

Thus for each $m$, 

$$\sum_i A_{mi}^\dagger A_{mi} \le I. \tag{3.15}$$
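
As a small illustration of (3.12) to (3.14), the sketch below evaluates outcome probabilities and normalized post-measurement states for a two-outcome measurement on a qubit. The particular operation elements (diagonal matrices parametrized by an angle `theta`) and the input state are assumptions chosen only so that the completeness relation (3.14) holds.

```python
import numpy as np

theta = 0.4
# measurement_ops[m] lists the operation elements A_{mi} for outcome m (assumed example).
measurement_ops = {
    0: [np.diag([np.cos(theta), np.sin(theta)])],
    1: [np.diag([np.sin(theta), np.cos(theta)])],
}

def outcome(rho, ops):
    """Return Pr(m) = tr(E_m(rho)) and the normalized state E_m(rho) / tr(E_m(rho))."""
    unnormalized = sum(A @ rho @ A.conj().T for A in ops)
    prob = np.trace(unnormalized).real
    return prob, unnormalized / prob

rho = np.array([[0.8, 0.1], [0.1, 0.2]])
results = {m: outcome(rho, ops) for m, ops in measurement_ops.items()}
probs = {m: results[m][0] for m in results}

# Completeness (3.14): summing A^dagger A over all outcomes gives the identity.
total = sum(A.conj().T @ A for ops in measurement_ops.values() for A in ops)
print(probs, np.allclose(total, np.eye(2)))
```

Each individual $\mathcal{E}_m$ is trace-decreasing, while the sum over outcomes is trace-preserving.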



As an aside, it is interesting to note that the formulation 
of quantum measurement based on the projection postulate 
[16-18], taught in most classes on quantum mechanics, 
is a special case of the quantum operations formalism, 
obtainable by using a single projector $A_m = P_m$ 
in the operator-sum representation for $\mathcal{E}_m$. The formalism 
of positive operator valued measures (POVMs) |1^ ] 
is also related to the generalized measurements formalism: 
$E_m \equiv \sum_i A_{mi}^\dagger A_{mi}$ are the elements of the POVM 
which is measured. 

A result analogous to the earlier representation the- 
orem for trace-preserving quantum operations can be 
proved for general operations. 

Theorem (general representation theorem for opera- 
tions) 

Suppose $\mathcal{E}$ is a general quantum operation. Then it 
is possible to find an environment, E, initially in a pure 
state $\sigma = |s\rangle\langle s|$ uncorrelated with the system, a unitary 
$U^{QE}$, a projector $P^E$ onto the environment alone, and a 
constant $c > 0$, such that 

$$\mathcal{E}(\rho) = c \, \mathrm{tr}_E\left( P^E U^{QE} (\rho \otimes \sigma) U^{QE\dagger} P^E \right). \tag{3.16}$$

Furthermore, in the case of a generalized measurement 
described by operations $\mathcal{E}_m$ it is possible to do so in such 
a way that for each $m$ the corresponding constant $c_m = 1$, 
and the projectors $P_m^E$ form a complete orthogonal set, 

$$\sum_m P_m^E = I^E, \qquad P_m^E P_n^E = \delta_{mn} P_m^E.$$

Conversely, any map of the form (3.16) is a quantum 
operation. 

Once again, introducing a reference system R which 
purifies $\rho^Q$ we are left with a picture of the dynamics 
that looks like this: 

FIG. 4. 

A few miscellaneous remarks will be useful later on. 

1. A prime always denotes a normalized state. For 
instance, 

$$\rho^{R'Q'} = \frac{(\mathcal{I}_R \otimes \mathcal{E})(\rho^{RQ})}{\mathrm{tr}\left((\mathcal{I}_R \otimes \mathcal{E})(\rho^{RQ})\right)}. \tag{3.17}$$

2. The total state of the system RQE starts and remains 
pure. That is, $\rho^{R'Q'E'}$ is a pure state. Purity 
gives very useful relations amongst Von Neumann 
entropies $S(\rho) = -\mathrm{tr}(\rho \log_2 \rho)$, such as $S(\rho^{R'Q'}) = S(\rho^{E'})$ 
and all other permutations amongst R, Q 
and E. These are used frequently in what follows. 

3. Generically we denote quantum operations by $\mathcal{E}$ 
and the dimension of the quantum system Q by 
d. 

4. Trace-preserving quantum operations arise when a 
system interacts with an environment, and no measurement 
is performed on the system plus environment. 
Non trace-preserving operations arise when 
classical information about the state of the system 
is made available by such a measurement. For 
most of this paper the noisy quantum channel is described 
by a trace-preserving quantum operation. 

5. Sometimes we consider the composition of two (or 
more) quantum operations. Generically we use the 
notation $\mathcal{E}_1, \mathcal{E}_2, \ldots$ for the different operations, and 
the notation $\mathcal{E}_2 \circ \mathcal{E}_1$ to denote composition of operations, 

$$(\mathcal{E}_2 \circ \mathcal{E}_1)(\rho) \equiv \mathcal{E}_2(\mathcal{E}_1(\rho)). \tag{3.18}$$

Furthermore it is sometimes useful to use the RQE 
picture of quantum operations to discuss compositions. 
We denote the environment corresponding to 
operation $\mathcal{E}_i$ by $E_i$, and assume environments corresponding 
to different values of $i$ are independent 
and initially pure. So, for example, the initial state 
for a two-stage composition would be 

$$\rho^{RQE_1E_2} = |\psi^{RQ}\rangle\langle\psi^{RQ}| \otimes |s_1\rangle\langle s_1| \otimes |s_2\rangle\langle s_2|. \tag{3.19}$$

A single prime denotes the state of the system after 
the application of $\mathcal{E}_1$, and a double prime denotes 
the state of the system after the application 
of $\mathcal{E}_2 \circ \mathcal{E}_1$, and so on. 



IV. ENTROPY EXCHANGE 

This section briefly reviews the definition and some 
basic results about the entropy exchange^ which was in- 
dependently introduced by Schumacher |j] and Lloyd . 
The entropy exchange turns out to be central to under- 
standing the noisy quantum channel. 

The entropy exchange of a quantum operation $\mathcal{E}$ with 
input $\rho$ is defined to be 

$$S_e(\rho,\mathcal{E}) \equiv S(\rho^{E'}), \tag{4.1}$$

where $\rho^{E'}$ is the state of an initially pure environment 
(the "mock" environment of the previous section) after 
the operation, and $S(\rho) \equiv -\mathrm{tr}(\rho \log_2 \rho)$ is the Von Neumann 
entropy. If $\mathcal{E}(\rho) = \sum_i A_i \rho A_i^\dagger$ then a convenient 
form for the entropy exchange is found by defining a matrix 
$W$ with elements 

$$W_{ij} \equiv \frac{\mathrm{tr}(A_i \rho A_j^\dagger)}{\mathrm{tr}(\mathcal{E}(\rho))}. \tag{4.2}$$

It can be shown [WJl4 that 

$$S_e(\rho,\mathcal{E}) = S(W) \equiv -\mathrm{tr}(W \log_2 W). \tag{4.3}$$



The last equation is frequently useful when performing 
calculations. 
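
A minimal numerical sketch of (4.2) and (4.3) follows. The helper names and the two test cases (a completely depolarizing qubit channel acting on the maximally mixed state, and the identity operation) are illustrative assumptions, not examples taken from the text.

```python
import numpy as np

def von_neumann_entropy(M):
    """S(M) = -tr(M log2 M) for a Hermitian, unit-trace matrix M."""
    evals = np.linalg.eigvalsh(M)
    evals = evals[evals > 1e-12]          # convention 0 log 0 = 0
    return float(-np.sum(evals * np.log2(evals)))

def entropy_exchange(kraus_ops, rho):
    """S_e(rho, E) = S(W), with W_ij = tr(A_i rho A_j^dagger) / tr(E(rho))."""
    W = np.array([[np.trace(A @ rho @ B.conj().T) for B in kraus_ops]
                  for A in kraus_ops])
    W = W / np.trace(W)                   # tr(W) before normalization is tr(E(rho))
    return von_neumann_entropy(W)

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Y = np.array([[0.0, -1.0j], [1.0j, 0.0]])
Zp = np.array([[1.0, 0.0], [0.0, -1.0]])

# Assumed example: completely depolarizing channel with elements I/2, X/2, Y/2, Z/2.
depolarizing = [I2 / 2, X / 2, Y / 2, Zp / 2]
S_depolarizing = entropy_exchange(depolarizing, I2 / 2)

# A unitary (here the identity) leaves the environment pure, so S_e vanishes.
S_identity = entropy_exchange([I2], I2 / 2)
print(S_depolarizing, S_identity)
```

For the depolarizing case $W$ is the 4 by 4 matrix with diagonal entries 1/4, so the entropy exchange is two bits.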



V. CLASSICAL NOISY CHANNELS IN A 
QUANTUM SETTING 

In this section we show how classical noisy channels 
can be formulated in terms of quantum mechanics. We 
begin by reviewing the formulation in terms of classical 
information theory. 

A classical noisy channel is described in terms of distinguishable 
channel states, which we label by $x$. If the 
input to the channel is symbol $x$ then the output is symbol 
$y$ with probability $p_{y|x}$. The channel is assumed to 
act independently on each input. For each $x$, the probability 
sum rule $\sum_y p_{y|x} = 1$ is satisfied. These conditional 
probabilities $p_{y|x}$ completely describe the classical noisy 
channel. 

Suppose the input to the channel, x, is represented by 
some classical random variable, X, and the output by a 
random variable Y . The mutual information between X 
and Y is defined by 

$$H(X:Y) \equiv H(X) + H(Y) - H(X,Y), \tag{5.1}$$

where $H(X)$ is the Shannon information of the random 
variable, X, defined by 

$$H(X) \equiv -\sum_x p(x) \log_2 p(x), \tag{5.2}$$

with $0 \log_2 0 \equiv \lim_{p \to 0} p \log_2 p = 0$. 

Shannon showed that the capacity of a noisy classical 
channel is given by the expression 

$$C_S = \max_{p(x)} H(X:Y), \tag{5.3}$$

where the maximum is taken over all possible distributions 
$p(x)$ for the channel input, X. Notice that although 
this is not an explicit expression for the channel capacity 
in terms of the conditional probabilities $p_{y|x}$, the maximization 
can easily be performed using well known techniques 
from numerical mathematics. That is, Shannon's 
result provides an effective procedure for computing the 
capacity of a noisy classical channel. 
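
One such numerical technique is the Blahut-Arimoto iteration, which is named here as our choice of illustration; the text does not specify a particular method. The sketch below maximizes (5.3) over input distributions for a channel given as a matrix `P[x, y]` of conditional probabilities $p_{y|x}$. The function names, the iteration count, and the two example channels are assumptions.

```python
import numpy as np

def mutual_information(p_x, P):
    """H(X:Y) in bits for input distribution p_x and channel P[x, y] = p(y|x)."""
    p_xy = p_x[:, None] * P
    p_y = p_xy.sum(axis=0)
    product = p_x[:, None] * p_y[None, :]
    mask = p_xy > 0
    return float(np.sum(p_xy[mask] * np.log2(p_xy[mask] / product[mask])))

def blahut_arimoto(P, iterations=200):
    """Iteratively reweight the input distribution to increase H(X:Y)."""
    p_x = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(iterations):
        p_y = p_x @ P
        with np.errstate(divide='ignore', invalid='ignore'):
            log_ratio = np.where(P > 0, np.log(P / p_y), 0.0)
        # D[x]: relative entropy (in nats) between row P[x] and the output distribution
        D = np.sum(P * log_ratio, axis=1)
        p_x = p_x * np.exp(D)
        p_x /= p_x.sum()
    return mutual_information(p_x, P), p_x

# Assumed examples: a binary symmetric channel and a Z-channel.
bsc = np.array([[0.9, 0.1], [0.1, 0.9]])
z_channel = np.array([[1.0, 0.0], [0.5, 0.5]])
C_bsc, p_bsc = blahut_arimoto(bsc)
C_z, p_z = blahut_arimoto(z_channel)
print(C_bsc, C_z)
```

For the binary symmetric channel the uniform input is already optimal, so the iteration returns one minus the binary entropy of the flip probability.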

All these results may be re-expressed in terms of quantum 
mechanics. We suppose the channel has some preferred 
orthonormal basis, $|x\rangle$, of signal states. For convenience 
we assume the set of input states, $|x\rangle$, is the same 
as the set of output states, $|y\rangle$, of the channel, although 
more general schemes are possible. For the purpose of 
illustration the present level of generality suffices. A classical 
input random variable, X, corresponds to an input 
density operator for the quantum channel, 

$$\rho_X \equiv \sum_x p(x) |x\rangle\langle x|. \tag{5.4}$$

The statistics of X are recoverable by measuring $\rho_X$ in 
the $|x\rangle$ basis. Defining operators $E_{xy}$ by 

$$E_{xy} \equiv |y\rangle\langle x|, \tag{5.5}$$

we find that the channel operation defined by 

$$\mathcal{N}(\rho) \equiv \sum_{xy} p_{y|x} E_{xy} \rho E_{xy}^\dagger \tag{5.6}$$

is a trace-preserving quantum operation, and that 

$$\mathcal{N}(\rho_X) = \rho_Y \equiv \sum_y p(y) |y\rangle\langle y|, \tag{5.7}$$

where $\rho_Y$ is the density operator corresponding to the 
random variable Y that would have been obtained from 
X given a classical channel with probabilities $p_{y|x}$. This 
classical sources and channels. It is interesting to see 
what form the mutual information and channel capacity 
take in the quantum formalism. 
Notice that 

$$H(X) = S(\rho_X) \tag{5.8}$$
$$H(Y) = S(\rho_Y) = S(\mathcal{N}(\rho_X)). \tag{5.9}$$

Next we compute the entropy exchange associated with 
the channel operating on input $\rho_X$, by computing the $W$ 
matrix given by (4.2). The $W$ matrix corresponding to 
the channel with input $\rho_X$ has entries 

$$W_{(xy)(x'y')} = \delta_{x,x'} \delta_{y,y'} \, p(x) p(y|x). \tag{5.10}$$

But the joint distribution of (X, Y) satisfies $p(x)p(y|x) = p(x,y)$. 
Thus $W$ is diagonal with eigenvalues $p(x,y)$, so 
the entropy exchange is given by 

$$S_e(\rho_X, \mathcal{N}) = H(X,Y). \tag{5.11}$$

It follows that 

$$H(X:Y) = S(\rho_X) + S(\mathcal{N}(\rho_X)) - S_e(\rho_X, \mathcal{N}), \tag{5.12}$$

and thus the Shannon capacity $C_S$ of the classical channel 
is given in the quantum formalism by 

$$C_S = \max_{\rho_X} \left[ S(\rho_X) + S(\mathcal{N}(\rho_X)) - S_e(\rho_X, \mathcal{N}) \right], \tag{5.13}$$

where the maximization is over all input states for the 
channel, $\rho_X$, which are diagonal in the $|x\rangle$ basis. 
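
The identity (5.12) can be checked numerically by building the operation (5.6) from a table of conditional probabilities. In the sketch below the operation elements are taken to be $\sqrt{p_{y|x}}\,E_{xy}$, which reproduces (5.6); the transition matrix `P`, the input distribution `p_x`, and the helper names are hypothetical.

```python
import numpy as np

def entropy_bits(M):
    """Von Neumann entropy (in bits) of a Hermitian, unit-trace matrix."""
    evals = np.linalg.eigvalsh(M)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def shannon_bits(p):
    """Shannon information (in bits) of a probability array."""
    p = np.asarray(p).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

P = np.array([[0.9, 0.1], [0.2, 0.8]])    # P[x, y] = p(y|x), assumed example
p_x = np.array([0.3, 0.7])                # assumed input distribution
d = 2
basis = np.eye(d)

# Operation elements sqrt(p(y|x)) |y><x|; they are real, so .T is the adjoint.
kraus_ops = [np.sqrt(P[x, y]) * np.outer(basis[y], basis[x])
             for x in range(d) for y in range(d)]

rho_X = np.diag(p_x)                                       # Eq. (5.4)
rho_Y = sum(A @ rho_X @ A.T for A in kraus_ops)            # Eq. (5.7)
W = np.array([[np.trace(A @ rho_X @ B.T) for B in kraus_ops]
              for A in kraus_ops])                         # Eq. (5.10)

quantum_side = entropy_bits(rho_X) + entropy_bits(rho_Y) - entropy_bits(W)

p_xy = p_x[:, None] * P
classical_side = shannon_bits(p_x) + shannon_bits(p_xy.sum(axis=0)) - shannon_bits(p_xy)
print(quantum_side, classical_side)
```

The two numbers agree because $W$ is diagonal with entries $p(x,y)$, as stated in (5.10).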

The problem we have been considering is that of trans- 
mitting a discrete set of orthogonal states (the states \x)) 
through the channel. In many quantum applications one 
is not only interested in transmitting a discrete set of 
states, but also arbitrary superpositions of those states. 
That is, one wants to transmit entire subspaces of states. 
In this case, the capacity of interest is the maximum rate 
of transmission of subspace dimensions. This may occur 
in quantum computing, cryptography and teleportation. 
It is also interesting in these applications to transmit the 



entanglement of states. This cannot be done by consid- 
ering the transmission of a set of orthogonal pure states 
alone. 

It is not difficult to see that $C_S$ is not correct as a measure 
of how many subspace dimensions may be reliably 
transmitted through a quantum channel. For example 
consider the classical noiseless channel, 

$$\mathcal{N}(\rho) = \sum_x |x\rangle\langle x| \rho |x\rangle\langle x|, \tag{5.14}$$

where $|x\rangle$ is an orthonormal set of basis states for the 
channel. It is easily seen that 

$$C_S = \log_2 d, \tag{5.15}$$



where d is the number of channel dimensions. Yet it is 
intuitively clear, and is later proved in a more rigorous 
fashion, that such a channel cannot be used to transmit 
any non-trivial subspace of state space, nor can it be used 
to transmit any entanglement, and thus its capacity for 
transmitting these types of quantum resources is zero. 



VI. ENTANGLEMENT FIDELITY 

In this section we review a quantity known as the en- 
tanglement fidelity y . It is this quantity which we use to 
study the effectiveness of schemes for sending information 
through a noisy quantum channel. 

The entanglement fidelity is defined for a process, spec- 
ified by a quantum operation £ acting on some initial 
state, p. We denote it by Fe{p,£). The concerns mo- 
tivating the definition of the entanglement fidelity are 
twofold: 

1. Fe{p, £) measures how well the state, p, is preserved 
by the operation £. An entanglement fidelity close 
to one indicates that the process preserves the state 
well. 

2. Fe{p,£) measures how well the entanglement of p 
with other systems is preserved by the operation £. 
An entanglement fidelity close to one indicates the 
process preserves the entanglement well. 

Conversely, an entanglement fidelity close to zero indi- 
cates that the state or its entanglement were not well 
preserved by the operation £. 

Formally, the entanglement fidelity is defined by 



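$$F_e(\rho,\mathcal{E}) \equiv \langle\psi^{RQ}|\rho^{R'Q'}|\psi^{RQ}\rangle = \frac{\langle\psi^{RQ}|\left[(\mathcal{I}_R \otimes \mathcal{E})(|\psi^{RQ}\rangle\langle\psi^{RQ}|)\right]|\psi^{RQ}\rangle}{\mathrm{tr}(\mathcal{E}(\rho))}. \tag{6.1}$$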



That is, the entanglement fidelity is the overlap between 
the initial purification lip^*^) of the state before it is sent 
through the channel with the state of the joint system 
RQ after it has been sent through the channel. The en- 
tanglement fidelity depends only on p and £, not on the 



particular purification |-0^'3^ of p that is used |j]. If £ has 
operation elements {Ai} then the entanglement fidelity 
has the expression MMl 



$$F_e(\rho,\mathcal{E}) = \frac{\sum_i |\mathrm{tr}(A_i \rho)|^2}{\mathrm{tr}(\mathcal{E}(\rho))}. \tag{6.2}$$



This expression simplifies for trace-preserving quantum 
operations since the denominator is one. The entangle- 
ment fidelity has the following properties P,p|,p^ . 

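1. $0 \le F_e(\rho,\mathcal{E}) \le 1$. 

2. $F_e(\rho,\mathcal{E}) = 1$ if and only if for all pure states $|\psi\rangle$ 
lying in the support of $\rho$, 

$$\mathcal{E}(|\psi\rangle\langle\psi|) = |\psi\rangle\langle\psi|. \tag{6.3}$$

3. The entanglement fidelity is a lower bound on the 
fidelity defined by Jozsa [19] in the following sense, 

$$F_e(\rho,\mathcal{E}) \le F(\rho,\mathcal{E}(\rho)). \tag{6.4}$$

4. Suppose $\{|\psi_i\rangle, p_i\}$ is an ensemble realizing $\rho$, 

$$\rho = \sum_i p_i |\psi_i\rangle\langle\psi_i|. \tag{6.5}$$

Then the entanglement fidelity is a lower bound on 
the average fidelity for the pure states $|\psi_i\rangle$, 

$$F_e(\rho,\mathcal{E}) \le \sum_i p_i \langle\psi_i|\mathcal{E}(|\psi_i\rangle\langle\psi_i|)|\psi_i\rangle. \tag{6.6}$$

5. Again suppose $\{|\psi_i\rangle, p_i\}$ is an ensemble realizing $\rho$. 
Then if the pure state fidelity $\langle\psi|\mathcal{E}(|\psi\rangle\langle\psi|)|\psi\rangle \ge 1 - \eta$ 
for all $|\psi\rangle$ in the support of $\rho$, $F_e(\rho,\mathcal{E}) \ge 1 - (3/2)\eta$. 
(Knill and Laflamme |§.) 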
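
The agreement between the purification form of the entanglement fidelity and the operator-sum expression (6.2) can be illustrated numerically. The sketch below is a minimal example under stated assumptions: the purification is built from the eigendecomposition of $\rho$, the joint ordering is R ⊗ Q, and the depolarizing channel with parameter `p` is a hypothetical test case.

```python
import numpy as np

def fidelity_from_kraus(kraus_ops, rho):
    """Operator-sum form (6.2): sum_i |tr(A_i rho)|^2 / tr(E(rho))."""
    out_trace = sum(np.trace(A @ rho @ A.conj().T) for A in kraus_ops).real
    return float(sum(abs(np.trace(A @ rho)) ** 2 for A in kraus_ops) / out_trace)

def fidelity_from_purification(kraus_ops, rho):
    """Overlap of a purification |psi^{RQ}> with the normalized state after I_R ⊗ E."""
    d = rho.shape[0]
    evals, evecs = np.linalg.eigh(rho)
    evals = np.clip(evals, 0.0, None)
    # |psi^{RQ}> = sum_k sqrt(lambda_k) |k>_R |e_k>_Q
    psi = sum(np.sqrt(evals[k]) * np.kron(np.eye(d)[k], evecs[:, k]) for k in range(d))
    proj = np.outer(psi, psi.conj())
    out = sum(np.kron(np.eye(d), A) @ proj @ np.kron(np.eye(d), A).conj().T
              for A in kraus_ops)
    return float((psi.conj() @ out @ psi).real / np.trace(out).real)

# Assumed example: qubit depolarizing channel with parameter p.
p = 0.2
I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Y = np.array([[0.0, -1.0j], [1.0j, 0.0]])
Zp = np.array([[1.0, 0.0], [0.0, -1.0]])
depolarizing = [np.sqrt(1 - 3 * p / 4) * I2] + [np.sqrt(p / 4) * M for M in (X, Y, Zp)]

rho = np.array([[0.7, 0.2], [0.2, 0.3]])
F_kraus = fidelity_from_kraus(depolarizing, rho)
F_purification = fidelity_from_purification(depolarizing, rho)
F_mixed = fidelity_from_kraus(depolarizing, I2 / 2)
print(F_kraus, F_purification, F_mixed)
```

For the maximally mixed input only the identity element contributes to (6.2), giving $1 - 3p/4$.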

There are several reasons for using the entanglement 
fidelity as our measure of success in transmitting quantum 
states. If we succeed in sending a source $\rho_s$ with 
high entanglement fidelity, we can send any ensemble 
for $\rho_s$ with high average pure-state fidelity, by item 4 
above. Entanglement fidelity is thus a more severe re- 
quirement of quantum coherence than average pure-state 
fidelity. Moreover, the ability to preserve entanglement 
is of great importance in applications of quantum coding 
to, say, quantum computation, where one would like to 
be able to apply error-correction in a modular fashion to 
small portions of a quantum computer despite the fact 
that they may, in the course of quantum computation, 
become entangled with other parts of the computer pi| . 
(Of course, the general problem of finding the capacity of 
a noisy quantum channel for a given ensemble with av- 
erage pure-state fidelity as the reliability measure is also 
worth investigating.) 



An appropriate measure of how well a subspace of 
quantum states is transmitted is the subspace fidelity 



$$F_s(P,\mathcal{E}) \equiv \min_{|\psi\rangle} \langle\psi|\mathcal{E}(|\psi\rangle\langle\psi|)|\psi\rangle, \tag{6.7}$$

where the minimization is over all pure states $|\psi\rangle$ in the 
subspace whose projector is P. Item 5 above implies that 
if the subspace fidelity is close to one, the entanglement 
fidelity is also close to one. The converse is not in general 
true. That is, reliable transmission of subspaces is a more 
stringent requirement than transmission of entanglement. 
Therefore using entanglement fidelity as our criterion for 
reliable transmission yields capacities at least as great 
as those obtained when subspace fidelity is used as the 
criterion. We conjecture that these two capacities are 
identical. 

As an alternative measure of subspace fidelity, one 
might consider the average pure state fidelity 



$$\int d\psi \, \langle\psi|\mathcal{E}(|\psi\rangle\langle\psi|)|\psi\rangle, \tag{6.8}$$



where the integration is done using the unitarily invariant 
measure on the subspace of interest. By item 4 above, 
the capacity resulting from this measure of reliability is 
at least as great as that which results when entanglement 
fidelity is used as the measure of reliability. We do not 
know whether these two capacities are equal. 

The lesson to be learnt from this discussion is that 
there are many different measures which may be used 
to quantify how reliably quantum states are transmitted, 
and different measures may result in different capacities. 
Which measure is used depends on what resource is most 
important for the application of interest. For the remain- 
der of this paper, we use the entanglement fidelity as our 
measure of reliability. 

There is a very useful inequality, the quantum Fano 
inequality^ which relates the entropy exchange and the 
entanglement fidelity. It is |j]: 

S,{p,E) < h(F,{p,E)) + (1 - F,{p,E))\og^{£ - 1), 

(6.9) 

where h{p) = —plogp — (1 — p) log(l — p) is the dyadic 
Shannon information associated with p. It is useful 
to note for our later work that < h{p) < 1 and 
log(d^ — 1) < 2 logd, so from the quantum Fano inequal- 
ity, 



Seip,£) < 1 + 2(1- F,ip,£))logd. 



(6.10) 



The proof of the quantum Fano inequality, (6.9), is simple enough that for convenience we repeat it here. Consider an orthonormal set of $d^2$ basis states, $|\psi_i\rangle$, for the system $RQ$. This basis set is chosen so $|\psi_1\rangle = |\psi^{RQ}\rangle$. If we form the quantities $p_i \equiv \langle\psi_i|\rho^{R'Q'}|\psi_i\rangle$, then it is possible to show (see for example [?], page 240)

$$ S(\rho^{R'Q'}) \le H(p_1,\ldots,p_{d^2}), \tag{6.11} $$

where $H(p_i)$ is the Shannon information of the set $p_i$. But by easily verified grouping properties of the Shannon entropy,

$$ H(p_1,\ldots,p_{d^2}) = h(p_1) + (1-p_1)\, H\!\left(\frac{p_2}{1-p_1},\ldots,\frac{p_{d^2}}{1-p_1}\right), \tag{6.12} $$

and it is easy to show that $H\!\left(\frac{p_2}{1-p_1},\ldots,\frac{p_{d^2}}{1-p_1}\right) \le \log(d^2-1)$. Combining these results and noting that $p_1 = F_e(\rho,\mathcal{E})$ by definition of the entanglement fidelity,

$$ S_e(\rho,\mathcal{E}) \le h(F_e(\rho,\mathcal{E})) + (1 - F_e(\rho,\mathcal{E})) \log(d^2 - 1), \tag{6.13} $$

which is the quantum Fano inequality.
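The inequality can be checked numerically. The following is a minimal sketch, assuming the operator-sum forms $F_e = \sum_i |\mathrm{tr}(A_i\rho)|^2$ and $S_e = S(W)$ with $W_{ij} = \mathrm{tr}(A_i \rho A_j^\dagger)$, and taking logarithms to base 2. The qubit depolarizing channel, the input state, and all function names are illustrative choices and do not appear in the text.

```python
import numpy as np

def entropy_bits(weights):
    """Shannon entropy in bits of a vector of non-negative weights."""
    weights = weights[weights > 1e-12]
    return float(-np.sum(weights * np.log2(weights)))

def entropy_exchange(kraus, rho):
    """S_e = S(W) with W_ij = tr(A_i rho A_j^dagger) (trace-preserving case)."""
    W = np.array([[np.trace(A @ rho @ B.conj().T) for B in kraus] for A in kraus])
    return entropy_bits(np.linalg.eigvalsh(W))

def entanglement_fidelity(kraus, rho):
    """F_e = sum_i |tr(A_i rho)|^2."""
    return float(sum(abs(np.trace(A @ rho)) ** 2 for A in kraus))

def depolarizing(p):
    """Kraus elements of a qubit depolarizing channel (illustrative assumption)."""
    I = np.eye(2, dtype=complex)
    X = np.array([[0, 1], [1, 0]], dtype=complex)
    Y = np.array([[0, -1j], [1j, 0]])
    Z = np.diag([1, -1]).astype(complex)
    return [np.sqrt(1 - 3 * p / 4) * I] + [np.sqrt(p / 4) * P for P in (X, Y, Z)]

def fano_gap(kraus, rho):
    """Right-hand side minus left-hand side of (6.9); non-negative if (6.9) holds."""
    d = rho.shape[0]
    F = entanglement_fidelity(kraus, rho)
    h = entropy_bits(np.array([F, 1 - F]))
    return h + (1 - F) * np.log2(d**2 - 1) - entropy_exchange(kraus, rho)

rho = np.diag([0.7, 0.3]).astype(complex)
for p in (0.05, 0.3, 0.9):
    print(p, fano_gap(depolarizing(p), rho))
```

A non-negative gap for every tested noise level is consistent with (6.9); the check is a sanity test on one channel family, not a proof.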

For applications it is useful to understand the continu- 
ity properties of the entanglement fidelity. To that end 
we prove the following lemma: 

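Lemma (continuity lemma for entanglement fidelity)
Suppose $\mathcal{E}$ is a trace-preserving quantum operation, $\rho$ is a density operator, and $\Delta$ is a Hermitian operator with trace zero. Then

$$ |F_e(\rho+\Delta,\mathcal{E}) - F_e(\rho,\mathcal{E})| \le 2\,\mathrm{tr}(|\Delta|) + \mathrm{tr}(|\Delta|)^2. \tag{6.14} $$

To prove the lemma we apply (6.2) to obtain

$$ |F_e(\rho+\Delta,\mathcal{E}) - F_e(\rho,\mathcal{E})| \le 2\sum_i |\mathrm{tr}(A_i\rho)|\,|\mathrm{tr}(A_i\Delta)| + \sum_i |\mathrm{tr}(A_i\Delta)|^2. \tag{6.15} $$

Applying a Cauchy-Schwarz inequality to each sum, the first with respect to the complex inner product $\sum_i x_i y_i^*$, the second with respect to the Hilbert-Schmidt inner product $\mathrm{tr}(X^\dagger Y)$, we obtain

$$ |F_e(\rho+\Delta,\mathcal{E}) - F_e(\rho,\mathcal{E})| \le 2\sqrt{\sum_i |\mathrm{tr}(A_i\rho)|^2 \sum_i |\mathrm{tr}(A_i\Delta)|^2} + \sum_i |\mathrm{tr}(A_i|\Delta|A_i^\dagger)|\,|\mathrm{tr}(|\Delta|)|, \tag{6.16} $$

where $|\Delta| \equiv \sqrt{\Delta^\dagger\Delta}$. Applying (6.2) and $F_e(\rho,\mathcal{E}) \le 1$ to the first sum and the trace-preserving property of $\mathcal{E}$ to the final sum gives

$$ |F_e(\rho+\Delta,\mathcal{E}) - F_e(\rho,\mathcal{E})| \le 2\sqrt{\sum_i |\mathrm{tr}(A_i\Delta)|^2} + \mathrm{tr}(|\Delta|)^2. \tag{6.17} $$

One final application of the Cauchy-Schwarz inequality and the trace-preserving property of $\mathcal{E}$ gives

$$ |F_e(\rho+\Delta,\mathcal{E}) - F_e(\rho,\mathcal{E})| \le 2\,\mathrm{tr}(|\Delta|) + \mathrm{tr}(|\Delta|)^2, \tag{6.18} $$

as required.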
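The bound (6.14) can be probed on random instances. The sketch below assumes that $F_e$ is extended to the perturbed operator $\rho + \Delta$ through the formula $\sum_i |\mathrm{tr}(A_i \cdot)|^2$, as in the proof; the random-channel construction (Kraus blocks cut from a random isometry) and all names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_channel(d, k):
    """Random trace-preserving operation: k Kraus blocks cut from an isometry."""
    G = rng.normal(size=(k * d, d)) + 1j * rng.normal(size=(k * d, d))
    V, _ = np.linalg.qr(G)  # orthonormal columns, so sum_i A_i^dagger A_i = I
    return [V[i * d:(i + 1) * d, :] for i in range(k)]

def random_state(d):
    """Random full-rank density operator."""
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

def random_perturbation(d, size):
    """Hermitian, trace-zero Delta scaled so that tr|Delta| equals `size`."""
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    H = (G + G.conj().T) / 2
    H = H - np.trace(H).real / d * np.eye(d)
    return size * H / np.sum(np.abs(np.linalg.eigvalsh(H)))

def entanglement_fidelity(kraus, op):
    """sum_i |tr(A_i op)|^2, evaluated on any operator `op`."""
    return float(sum(abs(np.trace(A @ op)) ** 2 for A in kraus))

def continuity_slack(kraus, rho, delta):
    """Right-hand side minus left-hand side of (6.14)."""
    t = np.sum(np.abs(np.linalg.eigvalsh(delta)))  # tr|Delta|
    change = abs(entanglement_fidelity(kraus, rho + delta)
                 - entanglement_fidelity(kraus, rho))
    return 2 * t + t**2 - change

worst = min(continuity_slack(random_channel(3, 4), random_state(3),
                             random_perturbation(3, s))
            for s in (0.01, 0.1, 0.5) for _ in range(50))
print("smallest slack found:", worst)
```

A non-negative smallest slack over the sampled instances is what the lemma predicts.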

This result gives bounds on the change in the entanglement fidelity when the input state is perturbed. Note, incidentally, that during the proof a coefficient $\sqrt{F_e(\rho,\mathcal{E})}$ was dropped from the first term on the right hand side of the inequality. For some applications it may be useful to apply the inequality with this coefficient in place.



VII. COHERENT INFORMATION 

In this section we investigate the coherent information. 
The coherent information was defined in [?], where it was
suggested that the coherent information plays a role in 
quantum information theory analogous to the role played 
by mutual information in classical information theory in 
the following sense. Consider a classical random process,

$$ X \to Y, \tag{7.1} $$



in which the random variable X is used as the input to 
some process which produces as output the random vari- 
able Y. The distributions of X and Y are related by a 
linear map, M, determined by the conditional probabil- 
ities of the process. An example of such a process is a 
noisy classical channel with input X and output Y. As 
discussed earlier, an important quantity in information 
theory is the mutual information, H{X : Y), between 
the input X and the output Y of the process. Note that 
H{X : Y) can be regarded as a function of the input X 
and the map $M$ only, since the joint distribution of X
and Y is determined by these. 

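Quantum mechanically we can consider a process defined by an input $\rho$, and output $\rho'$, with the process described by a quantum operation, $\mathcal{E}$,

$$ \rho \to \rho' = \mathcal{E}(\rho). \tag{7.2} $$

We assert that the coherent information, defined by

$$ I(\rho,\mathcal{E}) \equiv S\!\left(\frac{\mathcal{E}(\rho)}{\mathrm{tr}(\mathcal{E}(\rho))}\right) - S_e(\rho,\mathcal{E}), \tag{7.3} $$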



plays a role in quantum information theory analogous 
to that played by the mutual information H(X : Y) in 
classical information theory. This is not obvious from 
the definition, and one goal of this section is to make 
it appear plausible that this is the case. Of course, the 
true justification for regarding the coherent information 
as the quantum analogue of the mutual information is 
its success as the quantity appearing in results on channel capacity, as discussed in later sections. This is the true motivation for all definitions in information theory, whether classical or quantum: their success at quantifying the resources needed to perform some interesting physical task, not some abstract mathematical motivation.



In subsection VII A we review the data processing inequality which provides motivation for regarding the coherent information as a quantum analogue of the mutual information, and whose application is crucial to later reasoning. Subsection VII B studies in detail the properties of the coherent information. In particular, we prove several results related to convexity that are useful both as calculational aids, and also for proving later results. Subsection VII C proves a lemma about the entanglement fidelity that glues together many of our later proofs of upper bounds on the channel capacity. Finally, subsections VII D and VII E describe two important ways the behaviour of the coherent information differs from the behaviour of the mutual information when quantum entanglement is allowed.



A. Quantum Data Processing Inequality 

The role of coherent information in quantum information theory is intended to be similar to that of mutual information in classical information theory. This is not obvious from the definition, but can be given an operational motivation in terms of a procedure known as data processing. The classical data processing inequality [?] states that any three variable Markov process,

$$ X \to Y \to Z, \tag{7.4} $$

satisfies a data processing inequality,

$$ H(X) \ge H(X:Y) \ge H(X:Z). \tag{7.5} $$

The idea is that the operation $Y \to Z$ represents some kind of "data processing" of $Y$ to obtain $Z$, and the mutual information after processing, $H(X:Z)$, can be no higher than the mutual information before processing, $H(X:Y)$. Furthermore, suppose we have a Markov process,

$$ X \to Y, \tag{7.6} $$

such that $H(X) = H(X:Y)$. Intuitively, one might expect that it be possible to do data processing on $Y$ to recover $X$. It is not difficult to show that it is possible, using $Y$ alone, to construct a third variable, $Z$, forming a third stage in the Markov process,

$$ X \to Y \to Z, \tag{7.7} $$

such that $X = Z$ with probability one, if and only if $H(X) = H(X:Y)$.

An analogous quantum result has been proved by Schumacher and Nielsen [?]. It states that given trace-preserving quantum operations $\mathcal{E}_1$ and $\mathcal{E}_2$ defining a quantum process,

$$ \rho \to \mathcal{E}_1(\rho) \to (\mathcal{E}_2 \circ \mathcal{E}_1)(\rho), \tag{7.8} $$

then

$$ S(\rho) \ge I(\rho,\mathcal{E}_1) \ge I(\rho,\mathcal{E}_2 \circ \mathcal{E}_1). \tag{7.9} $$

Furthermore, it was shown in [?] that given a process

$$ \rho \to \mathcal{E}_1(\rho), \tag{7.10} $$

it is possible to find an operation $\mathcal{E}_2$ which reverses $\mathcal{E}_1$ if and only if

$$ S(\rho) = I(\rho,\mathcal{E}_1). \tag{7.11} $$

The close analogy between the classical and quantum data processing inequalities provides a strong operational motivation for considering the coherent information to be the quantum analogue of the classical mutual information.
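A small numerical illustration of (7.9) may be helpful. The sketch below evaluates the coherent information (7.3) from operation elements, using $W_{ij} = \mathrm{tr}(A_i\rho A_j^\dagger)/\mathrm{tr}(\mathcal{E}(\rho))$ for the entropy exchange. The two qubit channels (amplitude damping followed by dephasing), the input state, and the function names are assumptions chosen only for illustration.

```python
import numpy as np

def entropy_bits(weights):
    """Entropy in bits of a spectrum, ignoring numerically zero eigenvalues."""
    weights = weights[weights > 1e-12]
    return float(-np.sum(weights * np.log2(weights)))

def coherent_info(kraus, rho):
    """I(rho, E) = S(E(rho)/tr E(rho)) - S_e(rho, E), as in (7.3)."""
    out = sum(A @ rho @ A.conj().T for A in kraus)
    W = np.array([[np.trace(A @ rho @ B.conj().T) for B in kraus] for A in kraus])
    norm = np.trace(out).real
    return (entropy_bits(np.linalg.eigvalsh(out / norm))
            - entropy_bits(np.linalg.eigvalsh(W / norm)))

def compose(second, first):
    """Operation elements of (second o first)."""
    return [B @ A for B in second for A in first]

def amplitude_damping(g):
    """Qubit amplitude damping with decay probability g (illustrative)."""
    return [np.array([[1, 0], [0, np.sqrt(1 - g)]], dtype=complex),
            np.array([[0, np.sqrt(g)], [0, 0]], dtype=complex)]

def dephasing(p):
    """Qubit dephasing with phase-flip probability p (illustrative)."""
    return [np.sqrt(1 - p) * np.eye(2, dtype=complex),
            np.sqrt(p) * np.diag([1, -1]).astype(complex)]

rho = np.array([[0.6, 0.2], [0.2, 0.4]], dtype=complex)
E1, E2 = amplitude_damping(0.2), dephasing(0.1)
chain = (entropy_bits(np.linalg.eigvalsh(rho)),   # S(rho)
         coherent_info(E1, rho),                  # I(rho, E1)
         coherent_info(compose(E2, E1), rho))     # I(rho, E2 o E1)
print(chain)
```

For these trace-preserving operations the three printed values should be non-increasing from left to right, in agreement with (7.9).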

The proof of the quantum data processing inequality is repeated here in order to address the issue of what happens when $\mathcal{E}_1$ and $\mathcal{E}_2$ are not trace-preserving. The proof of the first inequality is to apply the subadditivity inequality [?] $S(\rho^{R'E'}) \le S(\rho^{R'}) + S(\rho^{E'})$ in the RQE picture of operations to obtain

$$ I(\rho,\mathcal{E}_1) = S(\mathcal{E}_1(\rho)) - S_e(\rho,\mathcal{E}_1) \tag{7.12} $$
$$ = S(\rho^{Q'}) - S(\rho^{E'}) \tag{7.13} $$
$$ = S(\rho^{R'E'}) - S(\rho^{E'}) \tag{7.14} $$
$$ \le S(\rho^{R'}) = S(\rho^{R}) = S(\rho). \tag{7.15} $$

It is clear that this part of the inequality need not hold when $\mathcal{E}_1$ is not trace-preserving. The reason for this is that it is no longer necessarily the case that $\rho^{R'} = \rho^{R}$, and thus it may not be possible to make the identification $S(\rho^{R'}) = S(\rho^{R})$. For example, suppose we have a three dimensional state space with orthonormal states $|1\rangle$, $|2\rangle$ and $|3\rangle$. Let $P_{12}$ be the projector onto the two dimensional subspace spanned by $|1\rangle$ and $|2\rangle$, and $P_3$ the projector onto the subspace spanned by $|3\rangle$. Let $\rho = \frac{p}{2} P_{12} + (1-p) P_3$, where $0 < p < 1$, and $\mathcal{E}(\rho) = P_{12}\rho P_{12}$. Then by choosing $p$ small enough we can make $S(\rho) \approx 0$, but $I(\rho,\mathcal{E}) = 1$, so we have an example of a non trace-preserving operation which does not obey the data processing inequality.
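The counterexample is easy to evaluate directly. The following sketch uses the single operation element $P_{12}$ and the normalized form of (7.3); the value $p = 0.01$ and the helper names are illustrative assumptions.

```python
import numpy as np

def entropy_bits(weights):
    """Entropy in bits of a spectrum, ignoring numerically zero eigenvalues."""
    weights = weights[weights > 1e-12]
    return float(-np.sum(weights * np.log2(weights)))

def coherent_info(kraus, rho):
    """I(rho, E) from (7.3), normalizing by tr(E(rho)) for non trace-preserving E."""
    out = sum(A @ rho @ A.conj().T for A in kraus)
    W = np.array([[np.trace(A @ rho @ B.conj().T) for B in kraus] for A in kraus])
    norm = np.trace(out).real
    return (entropy_bits(np.linalg.eigvalsh(out / norm))
            - entropy_bits(np.linalg.eigvalsh(W / norm)))

p = 0.01                                # illustrative small weight on the P12 subspace
P12 = np.diag([1.0, 1.0, 0.0])          # projector onto span{|1>, |2>}
P3 = np.diag([0.0, 0.0, 1.0])           # projector onto span{|3>}
rho = (p / 2) * P12 + (1 - p) * P3

S_rho = entropy_bits(np.linalg.eigvalsh(rho))
I_coh = coherent_info([P12], rho)       # E(rho) = P12 rho P12, a single element
print("S(rho) =", S_rho, " I(rho, E) =", I_coh)
```

With a single operation element the entropy exchange vanishes and the normalized output is $P_{12}/2$, so the coherent information is one bit while $S(\rho)$ is small, violating the first inequality of (7.9).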

The proof of the second part of the data processing inequality is to apply the strong subadditivity inequality [?],

$$ S(\rho^{R''E_1''E_2''}) + S(\rho^{E_1''}) \le S(\rho^{R''E_1''}) + S(\rho^{E_1''E_2''}), \tag{7.16} $$

where we are now using an $RQE_1E_2$ picture of the operations. From purity of the total state of $RQE_1E_2$ it follows that

$$ S(\rho^{R''E_1''E_2''}) = S(\rho^{Q''}). \tag{7.17} $$

Neither of the systems $R$ or $E_1$ are involved in the second stage of the dynamics in which $Q$ and $E_2$ interact unitarily. Thus, their state does not change during this stage: $\rho^{R''E_1''} = \rho^{R'E_1'}$. But from the purity of $RQE_1$ after the first stage of the dynamics,

$$ S(\rho^{R''E_1''}) = S(\rho^{R'E_1'}) = S(\rho^{Q'}). \tag{7.18} $$

The remaining two terms in the subadditivity inequality are now recognized as entropy exchanges,

$$ S(\rho^{E_1''}) = S(\rho^{E_1'}) = S_e(\rho,\mathcal{E}_1), \tag{7.19} $$
$$ S(\rho^{E_1''E_2''}) = S_e(\rho,\mathcal{E}_2 \circ \mathcal{E}_1). \tag{7.20} $$

Making these substitutions into the inequality obtained from strong subadditivity (7.16) yields

$$ S(\rho^{Q''}) + S_e(\rho,\mathcal{E}_1) \le S(\rho^{Q'}) + S_e(\rho,\mathcal{E}_2 \circ \mathcal{E}_1), \tag{7.21} $$

which can be rewritten as the second stage of the data processing inequality,

$$ I(\rho,\mathcal{E}_1) \ge I(\rho,\mathcal{E}_2 \circ \mathcal{E}_1). \tag{7.22} $$



Notice that this inequality holds provided £2 is trace- 
preserving, and does not require any assumption that £1 
is trace-preserving. This is very useful in our later work. 



B. Properties of Coherent Information 

The set of completely positive maps forms a positive cone, that is, if $\mathcal{E}_i$ is a collection of completely positive maps and $\lambda_i$ is a set of non-negative numbers then $\sum_i \lambda_i \mathcal{E}_i$ is also a completely positive map. In this section we prove two very useful properties of the coherent information. First, it is easy to see that for any quantum operation $\mathcal{E}$ and non-negative $\lambda$,

$$ I(\rho,\lambda\mathcal{E}) = I(\rho,\mathcal{E}). \tag{7.23} $$



This follows immediately from the definition of the coherent information. A slightly more difficult property to prove is the following.

Theorem (generalized convexity theorem for coherent information).

Suppose $\mathcal{E}_i$ are quantum operations. Then

$$ I\Big(\rho,\sum_i \mathcal{E}_i\Big) \le \frac{\sum_i \mathrm{tr}(\mathcal{E}_i(\rho))\, I(\rho,\mathcal{E}_i)}{\mathrm{tr}\big(\sum_i \mathcal{E}_i(\rho)\big)}. \tag{7.24} $$

This result is extremely useful in our later work. An important and immediate corollary is the following:

Corollary (convexity theorem for coherent information)

If a trace-preserving operation, $\mathcal{E} = \sum_i p_i \mathcal{E}_i$, is a convex sum ($p_i \ge 0$, $\sum_i p_i = 1$) of trace-preserving operations $\mathcal{E}_i$, then the coherent information is convex,

$$ I\Big(\rho,\sum_i p_i \mathcal{E}_i\Big) \le \sum_i p_i I(\rho,\mathcal{E}_i). \tag{7.25} $$

The proof of the corollary is immediate from the theorem. The theorem follows from the concavity of the conditional entropy (see references cited in [?], pages 249-250), which for two systems, 1 and 2, is defined by

$$ S(2|1) \equiv S(\rho_{12}) - S(\mathrm{tr}_2(\rho_{12})). \tag{7.26} $$

This expression is concave in $\rho_{12}$. Now notice that

$$ I(\rho,\mathcal{E}) = S(\rho^{Q'}) - S(\rho^{R'Q'}) = -S(R'|Q'). \tag{7.27} $$

The theorem now follows from the concavity of the conditional entropy.

A further useful result concerns the additivity of coherent information.

Theorem (additivity for independent channels)
Suppose $\mathcal{E}_1, \ldots, \mathcal{E}_n$ are quantum operations and $\rho_1, \ldots, \rho_n$ are density operators. Then

$$ I(\rho_1 \otimes \cdots \otimes \rho_n, \mathcal{E}_1 \otimes \cdots \otimes \mathcal{E}_n) = \sum_i I(\rho_i,\mathcal{E}_i). \tag{7.28} $$

The proof is immediate from the additivity property of entropies for product states.



C. A Lemma About Entanglement Fidelity 

The following lemma is the glue which holds together much of our later work on proving upper bounds to channel capacities. In this section we prove the lemma only for the special case of trace-preserving operations. A similar but more complicated result is true for general operations, and is given in section XI.

We begin by repeating the proof of a simple inequality that was first proved in [?], which states that the decrease (if any) in system entropy must be bounded above by the increase in the entropy of a pure environment. This applies only for trace-preserving operations $\mathcal{E}$. Applying the subadditivity inequality [?] $S(\rho^{Q'E'}) \le S(\rho^{Q'}) + S(\rho^{E'})$ and the relationship $S(\rho^{R'}) = S(\rho^{Q'E'})$ which follows from purity we obtain

$$ S(\rho) = S(\rho^{R}) \tag{7.29} $$
$$ = S(\rho^{R'}) \tag{7.30} $$
$$ = S(\rho^{Q'E'}) \tag{7.31} $$
$$ \le S(\rho^{Q'}) + S(\rho^{E'}). \tag{7.32} $$

Rewriting this slightly gives

$$ S(\rho) - S(\mathcal{E}(\rho)) \le S_e(\rho,\mathcal{E}), \tag{7.33} $$

for any trace-preserving quantum operation $\mathcal{E}$.

Lemma (entanglement fidelity lemma for operations)
Suppose $\mathcal{E}$ is a trace-preserving quantum operation, and $\rho$ is some quantum state. Then for all trace-preserving quantum operations $\mathcal{D}$,

$$ S(\rho) \le I(\rho,\mathcal{E}) + 2 + 4(1 - F_e(\rho,\mathcal{D} \circ \mathcal{E})) \log d. \tag{7.34} $$

This lemma is extremely useful in obtaining proofs of bounds on the channel capacity. When the entanglement fidelity is close to one, the final term on the right hand side is close to zero. This shows that the entropy of $\rho$ cannot greatly exceed the coherent information $I(\rho,\mathcal{E})$ if the entanglement fidelity is to be close to one.

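To prove the lemma, notice that by the second part of the data processing inequality, (7.9),

$$ S(\rho) - I(\rho,\mathcal{E}) \le S(\rho) - S((\mathcal{D} \circ \mathcal{E})(\rho)) + S_e(\rho,\mathcal{D} \circ \mathcal{E}). \tag{7.35} $$

Applying inequality (7.33) gives

$$ S(\rho) - S((\mathcal{D} \circ \mathcal{E})(\rho)) \le S_e(\rho,\mathcal{D} \circ \mathcal{E}), \tag{7.36} $$

and combining the last two inequalities gives

$$ S(\rho) - I(\rho,\mathcal{E}) \le 2 S_e(\rho,\mathcal{D} \circ \mathcal{E}) \tag{7.37} $$
$$ \le 2 h(F_e(\rho,\mathcal{D} \circ \mathcal{E})) + 2(1 - F_e(\rho,\mathcal{D} \circ \mathcal{E})) \log(d^2 - 1), \tag{7.38} $$

where the second step follows from the quantum Fano inequality (6.9). But the dyadic Shannon entropy $h$ is bounded above by 1 and $\log(d^2 - 1) \le 2 \log d$, so

$$ S(\rho) \le I(\rho,\mathcal{E}) + 2 + 4(1 - F_e(\rho,\mathcal{D} \circ \mathcal{E})) \log d. \tag{7.39} $$

This completes the proof.

This inequality is strong enough to prove the asymptotic bounds which are of most interest for our later work.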



The somewhat stronger inequality (7.38) is also useful when proving one-shot results, that is, when no block coding is being used.



D. Quantum Characteristics of the Coherent Information I

There are at least two important respects in which the coherent information behaves differently from the classical mutual information. In this subsection and the next we explain what these differences are.

Classically, suppose we have a Markov process,

$$ X \to Y \to Z. \tag{7.40} $$

Intuitively we expect that

$$ H(X:Z) \le H(Y:Z), \tag{7.41} $$



and, indeed, it is not difficult to prove such a "pipelining 
inequality" , based on the definition of the mutual infor- 
mation. The idea is that any information about X that 
reaches Z must go through Y, and therefore is also in- 
formation that Z has about Y. However, the quantum 
mechanical analogue of this result fails to hold. We shall 
see that the reason it fails is due to quantum entangle- 
ment. 

Example 1: 

Suppose we have a two-part quantum process described by quantum operations $\mathcal{E}_1$ and $\mathcal{E}_2$,

$$ \rho \to \mathcal{E}_1(\rho) \to (\mathcal{E}_2 \circ \mathcal{E}_1)(\rho). \tag{7.42} $$

Then, in general

$$ I(\rho,\mathcal{E}_2 \circ \mathcal{E}_1) \not\le I(\mathcal{E}_1(\rho),\mathcal{E}_2). \tag{7.43} $$

An explicit example showing that this is the case is given below. It is not possible to prove a general inequality of this sort for the coherent information - examples may be found where a <, > or = sign could occur in the last equation. We now show how the purely quantum mechanical effect of entanglement is responsible for this property of coherent information.

Notice that the truth of the equation

$$ I(\rho,\mathcal{E}_2 \circ \mathcal{E}_1) \le I(\mathcal{E}_1(\rho),\mathcal{E}_2), \tag{7.44} $$

is equivalent to

$$ S_e(\mathcal{E}_1(\rho),\mathcal{E}_2) \le S_e(\rho,\mathcal{E}_2 \circ \mathcal{E}_1). \tag{7.45} $$



This last equation makes it easy to see why (7.44) may 
fail. It is because the entropy of the joint environment 
for processes £1 and £2 (the quantity on the right-hand 
side) may be less than the entropy of the environment 
for process £2 alone (the quantity on the left). This is a 
property peculiar to quantum mechanics, which is caused 
by entanglement; there is no classical analogue. An ex- 
ample of this type of phenomenon is provided by an EPR 
pair, where the entropy of either system alone (one bit) 
is greater than that of the entire system, which is pure 
and thus has zero bits of entropy. 



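An example of (7.43) is as follows. For convenience we use the language of coding and channel operations, since that language is most convenient later. $\mathcal{E}_1$ is to be identified with the coding operation, $\mathcal{C}$, and $\mathcal{E}_2$ is to be identified with the channel operation, $\mathcal{N}$.

Suppose we have a four dimensional state space. We suppose that we have an orthonormal basis $|1\rangle, |2\rangle, |3\rangle, |4\rangle$, and that $P_{12}$ is the projector onto the space spanned by $|1\rangle$ and $|2\rangle$, and $P_{34}$ is the projector onto the space spanned by $|3\rangle$ and $|4\rangle$. Let $U$ be a unitary operator defined by

$$ U \equiv |3\rangle\langle 1| + |4\rangle\langle 2| + |1\rangle\langle 3| + |2\rangle\langle 4|. \tag{7.46} $$

The channel operation is

$$ \mathcal{N}(\rho) \equiv P_{12}\rho P_{12} + U^\dagger P_{34}\rho P_{34} U, \tag{7.47} $$

and we use an encoding defined by

$$ \mathcal{C}(\rho) \equiv \tfrac{1}{2} P_{12}\rho P_{12} + \tfrac{1}{2} U P_{12}\rho P_{12} U^\dagger + P_{34}\rho P_{34}. \tag{7.48} $$

It is easily checked that for any state $\rho$ whose support lies wholly in the space spanned by $|1\rangle$ and $|2\rangle$,

$$ (\mathcal{N} \circ \mathcal{C})(\rho) = \rho. \tag{7.49} $$

It follows that

$$ I(\rho,\mathcal{N} \circ \mathcal{C}) = S(\rho). \tag{7.50} $$

It is also easy to verify that

$$ I(\mathcal{C}(\rho),\mathcal{N}) = 2S(\rho) - 1. \tag{7.51} $$

Thus there exist states $\rho$ such that

$$ I(\rho,\mathcal{N} \circ \mathcal{C}) > I(\mathcal{C}(\rho),\mathcal{N}), \tag{7.52} $$

providing an example of (7.43).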

E. Quantum Characteristics of the Coherent 
Information II 



The second important difference between coherent information and classical mutual information is related to the property known classically as subadditivity of mutual information. Suppose we have several independent channels operating. Figure 5 shows the case of two channels.

[Figure 5: two independent channels, labelled Process 1 and Process 2]

FIG. 5. Dual classical channels operating on inputs $X_1$ and $X_2$ produce outputs $Y_1$ and $Y_2$.

These channels are numbered $1, \ldots, n$ and take as inputs random variables $X_1, \ldots, X_n$. The channels might be separated spatially, as shown in the figure, or in time. The channels are assumed to act independently on their respective inputs, and produce outputs $Y_1, \ldots, Y_n$. It is not difficult to show that

$$ H(X_1,\ldots,X_n : Y_1,\ldots,Y_n) \le \sum_i H(X_i : Y_i). \tag{7.53} $$

This property is known as the subadditivity of mutual information. It is used, for example, in proofs of the weak converse to Shannon's noisy channel coding theorem. We now show that the corresponding quantum statement about coherent information fails to hold.

Example 2: There exists a quantum operation $\mathcal{E}$ and a density operator $\rho_{12}$ such that

$$ I(\rho_{12},\mathcal{E} \otimes \mathcal{E}) \not\le I(\rho_1,\mathcal{E}) + I(\rho_2,\mathcal{E}), \tag{7.54} $$

where $\rho_1 = \mathrm{tr}_2(\rho_{12})$ and $\rho_2 = \mathrm{tr}_1(\rho_{12})$ are the usual reduced density operators for systems 1 and 2.

An example of (7.54) is the following. Suppose system 1 consists of two qubits, $A$ and $B$. System 2 consists of two more qubits, $C$ and $D$. As the initial state we choose

$$ \rho_{12} = \frac{I_A}{2} \otimes |\psi^{BD}\rangle\langle\psi^{BD}| \otimes \frac{I_C}{2}, \tag{7.55} $$

where $|\psi^{BD}\rangle$ is a Bell state shared between systems $B$ and $D$.

The action of the channel on $A$ and $B$ is as follows: it sets bit $B$ to some standard state, $|0\rangle$, and allows $A$ through unchanged. This is achieved by swapping the state of $B$ out into the environment. Formally,

$$ \mathcal{E}(\rho_{AB}) = \rho_A \otimes |0\rangle\langle 0|. \tag{7.56} $$

The same channel is now set to act on systems $C$ and $D$:

$$ \mathcal{E}(\rho_{CD}) = \rho_C \otimes |0\rangle\langle 0|. \tag{7.57} $$

A straightforward though slightly tedious calculation shows that with this channel setup

$$ I(\rho_1,\mathcal{E}) = I(\rho_2,\mathcal{E}) = 0, \tag{7.58} $$

and

$$ I(\rho_{12},\mathcal{E} \otimes \mathcal{E}) = 2. \tag{7.59} $$

Thus this setup provides an example of (7.54).
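The "slightly tedious calculation" can be delegated to a short numerical check. The sketch below assumes that the reset channel has operation elements $I \otimes |0\rangle\langle j|$, $j = 0, 1$, that the qubits $A$ and $C$ start in the maximally mixed state, and that the qubits are ordered $(A, B, C, D)$; these modelling choices and all names are illustrative.

```python
import numpy as np

def entropy_bits(weights):
    """Entropy in bits of a spectrum, ignoring numerically zero eigenvalues."""
    weights = weights[weights > 1e-12]
    return float(-np.sum(weights * np.log2(weights)))

def coherent_info(kraus, rho):
    """I(rho, E) = S(E(rho)) - S_e(rho, E) for a trace-preserving E."""
    out = sum(A @ rho @ A.conj().T for A in kraus)
    W = np.array([[np.trace(A @ rho @ B.conj().T) for B in kraus] for A in kraus])
    return (entropy_bits(np.linalg.eigvalsh(out))
            - entropy_bits(np.linalg.eigvalsh(W)))

I2 = np.eye(2)
ket0 = np.array([1.0, 0.0])
# Reset channel on a qubit pair: first qubit untouched, second set to |0>.
reset = [np.kron(I2, np.outer(ket0, I2[j])) for j in range(2)]
reset_twice = [np.kron(K, L) for K in reset for L in reset]   # acts on (A,B,C,D)

bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
bell_proj = np.outer(bell, bell)

# Build the state in the order (A, C, B, D), then reorder the qubits to (A, B, C, D).
rho_acbd = np.kron(np.kron(I2 / 2, I2 / 2), bell_proj)
rho12 = (rho_acbd.reshape([2] * 8)
         .transpose(0, 2, 1, 3, 4, 6, 5, 7)
         .reshape(16, 16))

rho1 = np.trace(rho12.reshape(4, 4, 4, 4), axis1=1, axis2=3)  # trace out C, D
rho2 = np.trace(rho12.reshape(4, 4, 4, 4), axis1=0, axis2=2)  # trace out A, B

I_joint = coherent_info(reset_twice, rho12)
I_1, I_2 = coherent_info(reset, rho1), coherent_info(reset, rho2)
print("joint:", I_joint, " marginals:", I_1, I_2)
```

Under these assumptions the joint coherent information is two bits while each marginal term vanishes: the environment receives the pure Bell pair jointly, but each half of it separately is maximally mixed.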



VIII. NOISY CHANNEL CODING REVISITED 

In this section we return to noisy channel coding. Recall the basic procedure for noisy channel coding, as illustrated in figure 6.

[Figure 6: Source ($\rho_s$), encoding $\mathcal{C}$, Channel Input ($\rho_c$), channel $\mathcal{N}$, Channel Output ($\rho_o$), Decoding $\mathcal{D}$, Receiver ($\rho_r$)]

FIG. 6. The noisy quantum channel, together with encodings and decodings.

Suppose a quantum source has output $\rho_s$. A quantum operation, which we shall denote $\mathcal{C}$, is used to encode the source, giving the input state to the channel, $\rho_c \equiv \mathcal{C}(\rho_s)$. The encoded state is used as input to the noisy channel, giving a channel output $\rho_o \equiv \mathcal{N}(\rho_c)$. Finally, a decoding quantum operation, $\mathcal{D}$, is used to decode the output of the channel, giving a received state, $\rho_r \equiv \mathcal{D}(\rho_o)$. The goal of noisy channel coding is to find out what source states can be sent with high entanglement fidelity. That is, we want to know for what states $\rho_s$ can encoding and decoding operations be found such that

$$ F_e(\rho_s,\mathcal{D} \circ \mathcal{N} \circ \mathcal{C}) \approx 1. \tag{8.1} $$



If large blocks of source states with entropy R per use of 
the channel can be sent through the channel with high 
fidelity, we say the channel is transmitting at the rate R. 
Shannon's noisy channel coding theorem is an example 
of a channel capacity theorem. Such theorems come in 
two parts: 

1. An upper bound is placed on the rate at which in- 
formation can be sent reliably through the channel. 
There should be an effective procedure for calculat- 
ing this upper bound. 

2. It is proved that a reliable scheme for encoding and decoding exists which comes arbitrarily close to attaining the upper bound found in 1.

This maximum rate at which information can be reliably sent through the channel is known as the channel capacity.

In this paper we consider only the first of these tasks, 
the placing of upper bounds on the rate at which quan- 
tum information can be reliably sent through a noisy 
quantum channel. The results we prove are analogous to 
the weak converse of the classical noisy coding theorem, 
but cannot be considered true converses until attainabil- 
ity of our bounds is demonstrated. We do consider it 
likely that our bounds are equal to the true quantum 
channel capacity. 



A. Mathematical Formulation of Noisy Channel 
Coding 

Up to this point the procedure for doing noisy chan- 
nel coding has been discussed in broad outline but we 
have not made all of our definitions mathematically pre- 
cise. This subsection gives a precise formulation for the 
most important concepts appearing in our work on noisy 
channel coding. 

Define a quantum source $\Sigma = (H_s, \Upsilon)$ to consist of a Hilbert space $H_s$ and a sequence $\Upsilon = \{\rho_s^1, \rho_s^2, \ldots, \rho_s^n, \ldots\}$ where $\rho_s^1$ is a density operator on $H_s$, $\rho_s^2$ a density operator on $H_s \otimes H_s$, and $\rho_s^n$ a density operator on $H_s^{\otimes n}$, etc. Using, for example, "$\mathrm{tr}_{34}$" to denote the partial trace over the third and fourth copies of $H_s$, we require as part of our definition of a quantum source that for all $j$ and all $n > j$,

$$ \mathrm{tr}_{j+1,\ldots,n}(\rho_s^n) = \rho_s^j, \tag{8.2} $$

i.e. that density operators in the sequence be consistent with each other in the sense that earlier ones be derivable from later ones by an appropriate partial trace. The $n$-th density operator is meant to represent the state of $n$ emissions from the source, normally thought of as taking $n$ units of time. (We could have used a single density operator on a countably infinite tensor product of spaces $H_s$, but we wish to avoid the technical issues associated with such products.) We define the entropy rate of a general source $\Sigma$ as

$$ S(\Sigma) \equiv \limsup_{n \to \infty} \frac{S(\rho_s^n)}{n}. \tag{8.3} $$

A special case of this general definition of a quantum source is the i.i.d. source $(H_s, \{\rho_s, \rho_s \otimes \rho_s, \ldots, \rho_s^{\otimes n}, \ldots\})$, for some fixed $\rho_s$. Such a source corresponds to the classical notion of an independent, identically distributed classical source, thus the term i.i.d. The entropy rate of this source is simply $S(\rho_s)$.
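For the i.i.d. source both the consistency condition (8.2) and the entropy rate (8.3) are simple to confirm numerically. The sketch below does so for a single-qubit $\rho_s$; the particular state and the function names are illustrative assumptions.

```python
import numpy as np
from functools import reduce

def vn_entropy(rho):
    """von Neumann entropy in bits."""
    eigs = np.linalg.eigvalsh(rho)
    eigs = eigs[eigs > 1e-12]
    return float(-np.sum(eigs * np.log2(eigs)))

def iid_state(rho, n):
    """n-fold tensor power of rho: the n-th state of the i.i.d. source."""
    return reduce(np.kron, [rho] * n)

def trace_out_last(rho_n, d):
    """Partial trace over the last d-dimensional copy of the source space."""
    m = rho_n.shape[0] // d
    return np.trace(rho_n.reshape(m, d, m, d), axis1=1, axis2=3)

rho_s = np.array([[0.8, 0.1], [0.1, 0.2]])     # illustrative source state

# Consistency (8.2): tracing out the third emission of rho^3 returns rho^2.
consistent = np.allclose(trace_out_last(iid_state(rho_s, 3), 2),
                         iid_state(rho_s, 2))

# Entropy per emission, S(rho^n)/n, for a few block lengths.
rates = [vn_entropy(iid_state(rho_s, n)) / n for n in (1, 2, 3, 4)]
print(consistent, rates, vn_entropy(rho_s))
```

Because entropy is additive over tensor products, every entry of `rates` should agree with $S(\rho_s)$, so the lim sup in (8.3) is attained at every $n$.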

A discrete memoryless channel, $(H_c, \mathcal{N})$ consists of a finite-dimensional Hilbert space, $H_c$, and a trace-preserving quantum operation $\mathcal{N}$. The $n$-th extension of that channel is given by the pair $(H_c^{\otimes n}, \mathcal{N}^{\otimes n})$, where $\otimes n$ is used to denote $n$-fold tensor products. The memoryless nature of the channel is reflected in the fact that the operation performed on the $n$ copies of the channel system is a tensor product of independent single-system operations.

Define an $n$-code $(\mathcal{C}, \mathcal{D})$ from $H_s$ into $H_c$ to consist of a trace-preserving quantum operation, $\mathcal{C}$, from $H_s^{\otimes n}$ to $H_c^{\otimes n}$, and a trace-preserving quantum operation $\mathcal{D}$ from $H_c^{\otimes n}$ to $H_s^{\otimes n}$. We refer to $\mathcal{C}$ as the encoding and $\mathcal{D}$ as the decoding.

The total coding operation $\mathcal{T}$ is given by

$$ \mathcal{T} \equiv \mathcal{D} \circ \mathcal{N}^{\otimes n} \circ \mathcal{C}. \tag{8.4} $$

The measure of success we use for the total procedure is the total entanglement fidelity,

$$ F_e(\rho_s^n, \mathcal{T}). \tag{8.5} $$



In practice we frequently abuse notation, usually by 
omitting explicit mention of the Hilbert spaces Hs and 
He. Note also that in principle the channel could have 
different input and output Hilbert spaces. To ease nota- 
tional clutter we do not consider that case here, but all 
the results we prove go through without change. 

Given a source state $\rho_s$ and a channel $\mathcal{N}$, the goal of noisy channel coding is to find an encoding $\mathcal{C}$ and a decoding $\mathcal{D}$ such that $F_e(\rho_s, \mathcal{T})$ is close to one; that is, $\rho_s$ and its entanglement is transmitted almost perfectly. In general this is not possible to do. However, Shannon showed in the classical context that by considering blocks of output from the source, and performing block encoding and decoding it is possible to considerably expand the class of source states $\rho_s$ for which this is possible. The quantum mechanical version of this procedure is to find a sequence of $n$-codes, $(\mathcal{C}^n, \mathcal{D}^n)$ such that as $n \to \infty$, the measure of success $F_e(\rho_s^n, \mathcal{T}^n)$ approaches one, where $\mathcal{T}^n \equiv \mathcal{D}^n \circ \mathcal{N}^{\otimes n} \circ \mathcal{C}^n$. (We will sometimes refer to such a sequence as a coding scheme.)

Suppose such a sequence of codes exists for a given source $\Sigma$. In this case the channel is said to transmit $\Sigma$ reliably. We also say that the channel can transmit reliably at a rate $R = S(\Sigma)$. (Note that this definition does not require that the channel be able to transmit reliably any source with entropy rate less than or equal to $R$; that is a different potential definition of what it means for a channel to transmit reliably at rate $R$. We conjecture that the two definitions are equivalent in the contexts considered in this paper.)

A noisy channel coding theorem enables one to determine, for any source and channel, whether or not the source can be transmitted reliably on that channel. Classically, this is determined by comparing the entropy rate of the source to the capacity of the channel. If the entropy rate of the source is greater than the capacity, the source cannot be transmitted reliably. If the entropy rate is less than the capacity, it can. The conjunction of these two statements is precisely the noisy channel coding theorem. (The case when the entropy rate of the source equals the capacity requires separate consideration; sometimes reliable transmission is achievable, and sometimes not.) We expect that in quantum mechanics, the entropy rate $S(\Sigma)$ of the source will play the role of the classical entropy rate. A channel will be able to transmit reliably any source with entropy rate less than the capacity; furthermore, no source with entropy rate greater than the capacity will be reliably transmissible (i.e., the channel will be unable to transmit reliably at a rate greater than the capacity.) The first part of this would constitute a quantum noisy channel coding theorem; the second, a "weak converse" of the theorem. (A "strong converse" would require not just that no source with entropy rate greater than the capacity can be reliably transmitted, i.e. transmitted with asymptotic fidelity approaching unity, but would require that all such sources have asymptotic fidelity of transmission approaching zero.)



IX. UPPER BOUNDS ON THE CHANNEL CAPACITY

In this section we investigate a variety of upper bounds on the capacity of a noisy quantum channel.



A. Unitary Encodings

This subsection is concerned with the case where the encoding, $\mathcal{C}$, is unitary.

For this subsection only we define

$$ C_n \equiv \max_\rho I(\rho,\mathcal{N}^{\otimes n}), \tag{9.1} $$

where the maximization is over all inputs $\rho$ to $n$ copies of the channel. The bound on the channel capacity proved in this section is defined by

$$ C(\mathcal{N}) \equiv \lim_{n \to \infty} \frac{C_n}{n}. \tag{9.2} $$

It is not immediately obvious that this limit exists. To see that it does, notice that $C_n \le n \log_2 d$ and $C_m + C_n \le C_{m+n}$ and apply the lemma proved in Appendix A. Notice that $C(\mathcal{N})$ is a function of the channel operation only.

We begin with a theorem that places a limit on the entropy rate of a source which can be sent through a quantum channel.

Theorem

Suppose we consider a source $\Sigma = (H_s, \{\ldots \rho_s^n \ldots\})$ and a sequence of unitary encodings $\mathcal{U}^n$ for the source. Suppose further that there exists a sequence of decodings, $\mathcal{D}^n$, such that

$$ \lim_{n \to \infty} F_e(\rho_s^n, \mathcal{D}^n \circ \mathcal{N}^{\otimes n} \circ \mathcal{U}^n) = 1. \tag{9.3} $$

Then

$$ S(\Sigma) = \limsup_{n \to \infty} \frac{S(\rho_s^n)}{n} \le C(\mathcal{N}). \tag{9.4} $$

This theorem tells us that we cannot reliably transmit more than $C(\mathcal{N})$ qubits of information per use of the channel.

For unitary $\mathcal{U}^n$ we have

$$ I(\rho_s^n, \mathcal{N}^{\otimes n} \circ \mathcal{U}^n) = I(\mathcal{U}^n(\rho_s^n), \mathcal{N}^{\otimes n}), \tag{9.5} $$

and thus

$$ I(\rho_s^n, \mathcal{N}^{\otimes n} \circ \mathcal{U}^n) \le C_n. \tag{9.6} $$

By (7.34) with $\mathcal{E} = \mathcal{N}^{\otimes n} \circ \mathcal{U}^n$, and the fact that $I(\mathcal{U}^n(\rho_s^n), \mathcal{N}^{\otimes n}) \le \max_\rho I(\rho, \mathcal{N}^{\otimes n}) = C_n$, it now follows that

$$ \frac{S(\rho_s^n)}{n} \le \frac{C_n}{n} + \frac{2}{n} + 4(1 - F_e(\rho_s^n, \mathcal{D}^n \circ \mathcal{N}^{\otimes n} \circ \mathcal{U}^n)) \log d. \tag{9.7} $$

(Note that $d$ here is the dimension of a single copy of the source Hilbert space, so that we have inserted $d^n$ for the overall dimension $d$ of (7.34).) Taking lim sups on both sides of the equation completes the proof of the theorem.

It is extremely useful to study this result at length, 
since the basic techniques employed to prove the bound 
are the same as those that appear in a more elaborate 
guise later in the paper. In particular, what features 
of quantum mechanics necessitate a change in the proof 
methods used to obtain the classical bound? 

Suppose the quantum analogue of the classical subadditivity of mutual information were true, namely

$$ I(\rho^n, \mathcal{N}^{\otimes n}) \le \sum_i I(\rho_i^n, \mathcal{N}), \tag{9.8} $$

where $\rho^n$ is any density operator that can be used as input to $n$ copies of the channel, and $\rho_i^n$ is the density operator obtained by tracing out all but the $i$-th channel. Then it would follow easily from the definition that $C_n = n C_1$ for all $n$, and thus

$$ C(\mathcal{N}) = C_1 = \max_\rho I(\rho, \mathcal{N}). \tag{9.9} $$

This expression is exactly analogous to the classical expression for channel capacity as a maximum over input distributions of the mutual information between channel input and output. If this were truly a bound on the quantum channel capacity then it would allow easy numerical evaluations of bounds on the channel capacity, as the maximization involved is easy to do numerically, and the coherent information is not difficult to evaluate.
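To indicate what such a numerical evaluation might look like, the sketch below scans the coherent information $I(\rho, \mathcal{N})$ over diagonal qubit inputs for a dephasing channel. Both the channel and the restriction to diagonal inputs are assumptions made for illustration, so in general the scan gives only a lower estimate of the single-use quantity $\max_\rho I(\rho, \mathcal{N})$; all function names are hypothetical.

```python
import numpy as np

def entropy_bits(weights):
    """Entropy in bits of a spectrum, ignoring numerically zero eigenvalues."""
    weights = weights[weights > 1e-12]
    return float(-np.sum(weights * np.log2(weights)))

def coherent_info(kraus, rho):
    """I(rho, N) = S(N(rho)) - S_e(rho, N) for a trace-preserving N."""
    out = sum(A @ rho @ A.conj().T for A in kraus)
    W = np.array([[np.trace(A @ rho @ B.conj().T) for B in kraus] for A in kraus])
    return (entropy_bits(np.linalg.eigvalsh(out))
            - entropy_bits(np.linalg.eigvalsh(W)))

def dephasing(p):
    """Qubit dephasing channel with phase-flip probability p (illustrative)."""
    return [np.sqrt(1 - p) * np.eye(2, dtype=complex),
            np.sqrt(p) * np.diag([1, -1]).astype(complex)]

def scan_diagonal_inputs(kraus, steps=201):
    """Largest coherent information over inputs diag(q, 1 - q) on a grid of q."""
    qs = np.linspace(0.0, 1.0, steps)
    values = [coherent_info(kraus, np.diag([q, 1 - q]).astype(complex)) for q in qs]
    best = int(np.argmax(values))
    return qs[best], values[best]

p = 0.1
best_q, best_value = scan_diagonal_inputs(dephasing(p))
h_p = entropy_bits(np.array([p, 1 - p]))
print("best q:", best_q, " estimate:", best_value, " 1 - h(p):", 1 - h_p)
```

For this particular channel the scan peaks at the maximally mixed input, where the coherent information equals $1 - h(p)$. A general channel would need a search over all density operators rather than a one-parameter grid.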

Unfortunately, it is not possible to assume that the quantum mechanical coherent information is subadditive, as shown by example (7.54), and thus in general it is possible that

$$ C(\mathcal{N}) > C_1. \tag{9.10} $$

We will later discuss results of Shor and Smolin [?] which demonstrate that the channel capacity can exceed $C_1$.

Notice that to evaluate the bound $C(\mathcal{N})$ involves taking the limit in (9.2). To numerically evaluate this limit directly is certainly not a trivial task, in general. The result we have presented, that (9.2) is an upper bound on channel capacity, is an important theoretical result, that may aid in the development of effective numerical procedures for obtaining general bounds. But it does not yet constitute an effective procedure.


B. General Encodings 

We now consider the case where something more gen- 
eral than a unitary encoding is allowed. In principle, it is 
always possible to perform a non- unitary encoding, C, by 
introducing an extra ancilla system, performing a joint 
unitary on the source plus ancilla, and then discarding 
the ancilla. 

We define 



C„ = max/(p,AA®"oC), 
pfi 



(9.11) 



where the maximization is over all inputs p to the encod- 
ing operation, C, which in turn maps to n copies of the 
channel. The bound on the channel capacity proved in 
this section is defined by 



C{M) = lim — , 

n— >oc 77. 



(9.12) 



Once again, to prove that this limit exists one applies the 
lemma proved in Appendix |A|. 

To prove that this quantity is a bound on the channel 
capacity, one applies almost exactly the same reasoning 
as in the preceding subsection. The result is: 

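Theorem: Suppose we consider a source $\Sigma = (H_s, \{\ldots \rho^n \ldots\})$
and a sequence of encodings $\mathcal{C}^n$ for the
source. Suppose further that there exists a sequence of
decodings, $\mathcal{D}^n$, such that

$\lim_{n \to \infty} F_e(\rho^n, \mathcal{D}^n \circ \mathcal{N}^{\otimes n} \circ \mathcal{C}^n) = 1$. (9.13)

Then

$S(\Sigma) \equiv \limsup_{n \to \infty} \frac{S(\rho^n)}{n} \le C(\mathcal{N})$. (9.14)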



Again, this result places an upper bound on the rate
at which information can be reliably transmitted through
a noisy quantum channel. The proof is very similar to
the earlier proof of a bound for unitary encodings. One
simply applies (7.34) with $\mathcal{E} = \mathcal{N}^{\otimes n} \circ \mathcal{C}^n$ and $\mathcal{D} = \mathcal{D}^n$,
to give:

$\frac{S(\rho^n)}{n} \le \frac{C_n}{n} + \frac{2}{n} + 4\left(1 - F_e(\rho^n, \mathcal{D}^n \circ \mathcal{N}^{\otimes n} \circ \mathcal{C}^n)\right) \log_2 d$. (9.15)

Taking lim sups on both sides of the equation completes
the proof.

It is instructive to see why the proof fails when the
maximization is done over channel input states alone,
rather than over all source states and encoding schemes.
The basic idea is that there may exist source states, $\rho_s$,
and encoding schemes $\mathcal{C}$, for which

$I(\rho_s, \mathcal{N} \circ \mathcal{C}) > I(\mathcal{C}(\rho_s), \mathcal{N})$. (9.16)

This possibility stems from the failure of the quantum
pipelining inequality, (7.43). It is clear that the existence
of such a scheme would cause the line of proof suggested
above to fail. Classically the pipelining inequality holds,
and therefore the complication of having to maximize
over encodings does not arise.

Having proved that $C(\mathcal{N})$ is an upper bound on the
channel capacity, let us now investigate some of the
properties of this bound. First we examine the range over
which $C(\mathcal{N})$ can vary. Note that

$0 \le C_n \le n \log_2 d$, (9.17)

since if $\rho$ is pure then $I(\rho, \mathcal{N}^{\otimes n} \circ \mathcal{C}) = 0$ for any encoding
$\mathcal{C}$, and for all $\rho$ and $\mathcal{C}$, $I(\rho, \mathcal{N}^{\otimes n} \circ \mathcal{C}) \le \log_2 d^n = n \log_2 d$,
since the channel output has $d^n$ dimensions. It follows
that

$0 \le C(\mathcal{N}) \le \log_2 d$. (9.18)

This parallels the classical result, which states that the
channel capacity varies between $0$ and $\log_2 s$, where $s$ is
the number of channel symbols. The upper bound on the
classical capacity is attained if and only if the classical
channel is noiseless.

In the case when $\mathcal{N}$ takes a constant value,

$\mathcal{N}(\rho) = \sigma$, (9.19)

for all channel inputs, $\rho$, it is not difficult to verify that
$C(\mathcal{N}) = 0$. This is consistent with the obvious fact that
the capacity for quantum information of such a channel
is zero.

The "completely decohering channel" is defined by

$\mathcal{N}(\rho) = \sum_i P_i \rho P_i$, (9.20)

with $P_i = |i\rangle\langle i|$ a complete orthonormal set of
one-dimensional projectors. This channel is classically
noiseless, yet a straightforward application of (7.24) yields
$C(\mathcal{N}) = 0$, and therefore this channel has zero
capacity for the transmission of entanglement.
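
As a single-copy numerical illustration of this claim, the following Python sketch evaluates the coherent information of the completely decohering channel directly from its projectors, using the entropy of the exchange matrix $W_{ij} = \mathrm{tr}(A_i \rho A_j^\dagger)$ as the entropy exchange. The helper names (`entropy`, `coherent_information`, `random_density_matrix`) and the choice $d = 3$ are illustrative assumptions rather than notation from this paper. The check covers only one use of the channel with no encoding, so it illustrates, but does not prove, that $C(\mathcal{N}) = 0$.

```python
import numpy as np

def entropy(mat):
    """Von Neumann entropy, in bits, of a Hermitian matrix with unit trace."""
    evals = np.linalg.eigvalsh(mat)
    evals = evals[evals > 1e-12]  # drop zero eigenvalues (0 log 0 = 0)
    return float(-np.sum(evals * np.log2(evals)))

def coherent_information(rho, kraus):
    """I(rho, N) = S(N(rho)) - S(W) for a trace-preserving N with operators `kraus`."""
    out = sum(A @ rho @ A.conj().T for A in kraus)
    # Exchange matrix W_ij = tr(A_i rho A_j^dagger); its entropy is the entropy exchange.
    W = np.array([[np.trace(A @ rho @ B.conj().T) for B in kraus] for A in kraus])
    return entropy(out) - entropy(W)

def random_density_matrix(dim, rng):
    """Random full-rank density matrix (hypothetical test input)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

d = 3  # assumed dimension, chosen only for illustration
basis = np.eye(d)
projectors = [np.outer(basis[i], basis[i]) for i in range(d)]

rng = np.random.default_rng(0)
values = [coherent_information(random_density_matrix(d, rng), projectors)
          for _ in range(5)]
print(values)  # each entry is zero up to rounding error
```

For this channel both $\mathcal{N}(\rho)$ and $W$ reduce to the diagonal of $\rho$ in the $|i\rangle$ basis, which is why the two entropies cancel for every input.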

More generally, if $\mathcal{N}(\rho) = \sum_i A_i \rho A_i^\dagger$, where $A_i =
\lambda_i |a_i\rangle\langle b_i|$, then $C(\mathcal{N}) = 0$ by the same argument, and
thus the channel capacity for such a channel is zero. As
a special case of this result, it follows that the capacity of 
any classical channel as defined in section M to transmit 
entanglement is zero. 

Provided the input and output dimensions of the channel
are the same, it is not difficult to show that $C(\mathcal{N}) =
\log_2 d$ if and only if $\mathcal{N}$ is unitary.

It is also of interest to consider what happens when
channels $\mathcal{N}_1$ and $\mathcal{N}_2$ are composed, forming a
concatenated channel, $\mathcal{N} = \mathcal{N}_2 \circ \mathcal{N}_1$. From the data processing
inequality it follows that

$C(\mathcal{N}_1) \ge C(\mathcal{N})$. (9.21)



It is clear by repeated application of the data-processing 
inequality that this result also holds if we compose more 
than two channels together, and even holds if we allow 
intermediate decoding and re-encoding stages. Classical 
channel capacities also behave in this way: the capacity 
of a channel made by composing two (or more) channels 
together is no greater than the capacity of the first part 
of the channel alone. 



Although (7.43) might seem to suggest otherwise, in
fact

$C(\mathcal{N}_2) \ge C(\mathcal{N})$. (9.22)

For let us suppose that $\mathcal{C}$ is the encoding which achieves
$C(\mathcal{N})$, so that the total operation is $\mathcal{D} \circ \mathcal{N} \circ \mathcal{C} = \mathcal{D} \circ \mathcal{N}_2 \circ
\mathcal{N}_1 \circ \mathcal{C}$. As our encoding for the channel $\mathcal{N}_2$, we may use
$\mathcal{N}_1 \circ \mathcal{C}$ and decode with $\mathcal{D}$, hence achieving precisely the
same total operation.

Inequalities analogous to (9.21) and (9.22) may also
be stated for the actual channel capacity. Clearly these
statements are true as well.



C. Other Encoding Protocols 

So far we have considered two allowed classes of encod- 
ings: encodings where a general unitary operation can be 
performed on a block of quantum systems, and encod- 
ings where a general trace-preserving quantum operation 
can be performed on a block of quantum systems. If 
large-scale quantum computation ever becomes feasible 
it may be realistic to consider encoding protocols of this 
sort. However, for present-day applications of quantum 
communication such as quantum cryptography and tele- 
portation, it is realistic to consider much more restricted 



classes of encodings. In this section we describe several 
such classes. 

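We begin by considering the class involving local
unitary operations only. We refer to this class as U-L. It
consists of the set of operations $\mathcal{C}$ which can be written
in the form

$\mathcal{C}(\rho) = (U_1 \otimes \cdots \otimes U_n) \rho (U_1^\dagger \otimes \cdots \otimes U_n^\dagger)$, (9.23)

where $U_1, \ldots, U_n$ are local unitary operations on systems
1 through n. Another possibility is the class L of
encodings involving local operations only, i.e. operations of the
form:

$\mathcal{C}(\rho) = \sum_{i_1, \ldots, i_n} (A_{i_1} \otimes \cdots \otimes Z_{i_n}) \rho (A_{i_1}^\dagger \otimes \cdots \otimes Z_{i_n}^\dagger)$. (9.24)

In other words, the overall operation has a tensor product
form $\mathcal{A} \otimes \mathcal{B} \otimes \cdots \otimes \mathcal{Z}$.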
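
To make the U-L class concrete, the sketch below builds an encoding of the form (9.23) as a Kronecker product of single-qubit unitaries and applies it to a three-qubit state. The function names `random_unitary` and `local_unitary_encoding`, the use of qubits, and the random test state are assumptions made for illustration only. Because the overall map is unitary, the spectrum (and hence the entropy) of the input is unchanged.

```python
import numpy as np
from functools import reduce

def random_unitary(dim, rng):
    """Unitary matrix obtained from the QR decomposition of a random complex matrix."""
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(z)
    return q

def local_unitary_encoding(rho, unitaries):
    """Apply (U_1 x ... x U_n) rho (U_1 x ... x U_n)^dagger, as in (9.23)."""
    U = reduce(np.kron, unitaries)
    return U @ rho @ U.conj().T

rng = np.random.default_rng(1)
n = 3  # number of qubits, an illustrative choice
unitaries = [random_unitary(2, rng) for _ in range(n)]

# Hypothetical input: a random density matrix on n qubits.
g = rng.normal(size=(2**n, 2**n)) + 1j * rng.normal(size=(2**n, 2**n))
rho = g @ g.conj().T
rho = rho / np.trace(rho).real

encoded = local_unitary_encoding(rho, unitaries)
print(np.allclose(np.linalg.eigvalsh(encoded), np.linalg.eigvalsh(rho)))
```

An encoding in the class L would replace each unitary factor by a set of operation elements, with a sum over all index combinations as in (9.24).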

A more realistic class is 1-L - encoding by local opera- 
tions with one way classical communication. The idea is 
that the encoder is allowed to do encoding by perform- 
ing arbitrary quantum operations on individual mem- 
bers (typically, a single qubit) of the strings of quan- 
tum systems emitted by a source. This is not unrealis- 
tic with present day technology for manipulating single 
qubits. Such operations could include arbitrary unitary 
rotations, and also generalized measurements. After the 
qubit is encoded, the results of any measurements done 
during the encoding may be used to assist in the encoding 
of later qubits. This is what we mean by one way com- 
munication - the results of the measurement can only be 
used to assist in the encoding of later qubits, not earlier 
qubits. 

Another possible class is 2-L - encoding by local oper- 
ations with two-way classical communication. This may 
arise in a situation where there are many identical chan- 
nels operating side by side in space. Once again it is 
assumed that the encoder can perform arbitrary local op- 
erations, only this time two-way classical communication 
is allowed when performing the encoding. 

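For any class of encodings $A$, arguments analogous to
those used above for general and for unitary block coding
ensure that the expression

$C_A(\mathcal{N}) \equiv \lim_{n \to \infty} \frac{C_{A,n}}{n}$, (9.25)

where

$C_{A,n} \equiv \max_{\rho, \mathcal{C} \in A} I(\rho, \mathcal{N}^{\otimes n} \circ \mathcal{C})$, (9.26)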



is an upper bound to the rate at which quantum infor- 
mation can be reliably transmitted using encodings in 
$A$. Thus, in addition to the bounds for general and
unitary encodings, there are bounds $C_{U-L}$, $C_L$, $C_{1-L}$, and
$C_{2-L}$, which provide upper bounds on the rate of
quantum information transmission for these types of
encodings. A priori it is not clear what the exact relationships
are amongst these bounds, although various inequalities
may easily be proved:

$C_{U-L} \le C_L \le C_{1-L} \le C_{2-L} \le C_{\rm general}$, (9.27)

$C_{U-L} \le C_{\rm unitary}$, (9.28)

$C_{\rm unitary} \le C_{\rm general}$. (9.29)

Furthermore, note that these bounds allow general de- 
coding schemes. It is possible that much tighter bounds 
may result if we restrict the decoding schemes in the same 
way we have restricted the encoding schemes. 

An interesting and important question is whether there
are closed-form characterizations of the sets of quantum
operations corresponding to particular types of encoding
schemes such as 1-L and 2-L. For example, in the cases
of U-L and L there are explicit forms (9.23, 9.24) for the
classes of encodings allowed. For 1-L the operations take
the form:

$\sum_{i,j,\ldots} (A_i \otimes B_{i,j} \otimes \cdots) \rho (A_i^\dagger \otimes B_{i,j}^\dagger \otimes \cdots)$. (9.30)



A drawback to this expression is that it is not written in a
closed form, making it difficult to perform optimizations
over 1-L. It would be extremely valuable to obtain a
closed form for the set of operations in 1-L. One possible
approach to doing this is to limit the range of the indices
in the previous expression. This is related to the number
of rounds of classical communication which are involved
in the operation. Similar remarks to these also apply to
the class 2-L. Indeed, it is not yet clear to us if there is
an expression analogous to (9.30) for 2-L encodings. One
possibility is:

$\sum_i (A_i \otimes B_i \otimes \cdots \otimes Z_i) \rho (A_i^\dagger \otimes B_i^\dagger \otimes \cdots \otimes Z_i^\dagger)$. (9.31)



However, although all 2-L operations involving a finite 
number of rounds of communication can certainly be put 
in this form, we do not presently see why all operations 
expressible in this form should be realizable with local 
operations and two-way classical communication. 

The classes we have described in this subsection are 
certainly not the only realistic classes of encodings. Many 
more classes may be considered, and in specific appli- 
cations this may well be of great interest. What we 
have done is illustrated a general technique for obtain- 
ing hounds on the channel capacity for different classes 
of encodings. A major difference between classical in- 
formation theory and quantum information theory is the 
greater interest in the quantum case in studying differ- 
ent classes of encodings. Classically it is, in principle. 



possible to perform an arbitrary encoding and decoding 
operation using a look-up table. However, quantum me- 
chanically this is far from being the case, so there is corre- 
spondingly more interest in studying the channel capac- 
ities that may result from considering different classes of 
encodings and decodings. 



X. DISCUSSION 

What then can be said about the status of the quantum 
noisy channel coding theorem in the light of comments 
made in the preceding sections? While we have estab- 
lished upper bounds, we have not proved achievability of 
these bounds. How might one prove that these bounds 
are achievable? 

Lloyd has also proposed an expression involving a
maximum of the coherent information as the channel
capacity,

$\max_\rho I(\rho, \mathcal{N})$, (10.1)

and outlines a technique involving random coding for
achieving rates up to this quantity. The criterion for
reliable transmission used by Lloyd appears to be the
subspace fidelity criterion of eqtn. (6.7). As noted
earlier, this criterion is at least as strong as the criterion
based on entanglement fidelity which we have been
using, that is, asymptotically good coding schemes with
respect to subspace fidelity are also asymptotically good
with respect to the entanglement fidelity.

Suppose one applies coding schemes to achieve rates
up to (10.1), but with the basic system used in blocking
taken to be $n$ of the old systems. Then it is clear that
rates up to

$\max_\rho \frac{I(\rho, \mathcal{N}^{\otimes n})}{n}$ (10.2)

may be achieved using such coding schemes, where the
maximization is done over density operators for $n$ copies
of the source. It follows that rates up to

$\lim_{n \to \infty} \max_\rho \frac{I(\rho, \mathcal{N}^{\otimes n})}{n}$ (10.3)

may be achieved. This quantity is simply the bound (9.2)
which we found earlier for noisy channels with the class
of encodings restricted to be unitary. As remarked in
the last section, it is in general not possible to identify
the quantity appearing in the previous equation with the
quantity (10.1), because the coherent information is not,
in general, subadditive, cf. eqtn. (7.54).

The coding schemes considered by Lloyd appear to be
restricted to be projections followed by unitaries. We
call such encodings restricted encodings, since they do
not cover the full class of encodings possible. For the
purposes of proving upper bounds it is not sufficient to
consider a restricted class of encodings, since it is possible
that other coding schemes may do better, and therefore
that the capacity is somewhat larger than (10.2). We
suspect that this is not the case, but have been unable
to provide a rigorous proof. A heuristic argument is
provided in subsection X A.



In the light of these remarks it is interesting that the
coding scheme of Shor and Smolin [23] provides an
example where rates of transmission exceeding (10.1) are
obtained. Nevertheless, the general bound (9.12) must
still be obeyed by a coding scheme of their type.

However, one can still make progress towards a proof
that the expression (9.12), which bounds the channel
capacity, is the correct capacity. If we accept that it is
possible to attain rates up to (10.2), then the following
four-stage construction shows that (9.12) is a correct
expression for the capacity; i.e. that in addition to being an
upper bound as shown in section IX, it is also achievable.
FIG. 7.
Noisy quantum channel with an extra stage, a re- 
stricted pre-encoding, hi. 

For a fixed block size, $n$, one finds an encoding, $\mathcal{C}^n$, for
which the maximum in

$C_n \equiv \max_{\mathcal{C}^n, \rho} I(\rho, \mathcal{N}^{\otimes n} \circ \mathcal{C}^n)$ (10.4)

is achieved. One then regards the composition $\mathcal{N}^{\otimes n} \circ \mathcal{C}^n$
as a single noisy quantum channel, and applies the
achievability result on restricted encodings to the joint
channel $\mathcal{N}^{\otimes n} \circ \mathcal{C}^n$ to achieve an even longer $mn$ block
coding scheme with high entanglement fidelity.

This gives a joint coding scheme, the restricted pre-encoding
of FIG. 7 followed by $(\mathcal{C}^n)^{\otimes m}$, which
for sufficiently large blocks $m$ and $n$ can come arbitrarily
close to achieving the channel capacity (9.12). An
important open question is whether (9.12) is equal to (9.2).
It is clear that the former expression is at least as large
as the latter; we give a heuristic argument for equality in
the next subsection, but rigorous results are needed.
Thus, we think it likely that the expression (9.2) will 



turn out to be the maximum achievable rate of reliable 
transmission through a quantum channel. But this is 
still not satisfactory as an expression for the capacity, 
because of the difficulty of evaluating the limit involved. 
At a minimum, we would like to know enough about the 
rate of convergence of $C_n$ to its limit to be able to
accurately estimate the error in a numerical calculation of
capacity, thus providing an effective procedure for calcu- 
lating the capacity to any desired degree of accuracy. It 
would also be useful to know whether the coherent infor- 
mation is concave in the input density operator; if this 



were so, it would greatly aid in establishing that one has 
reached a maximum, since in this case a local maximum 
is guaranteed to be a global maximum (Appendix g) . 



A. Unitary versus Nonunitary Encoding 

For the purposes of obtaining a capacity theorem for 
general encodings and decodings, a restriction on the 
class of encodings is clearly unacceptable. For example, 
given a source density operator whose eigenvalues are 
not all equal, we may not even be able to send it reliably 
through a noiseless channel whose capacity is just greater 
than the source entropy rate without doing non-unitary 
compression as described in |24|-|2q1. This compression, 
which is essentially projection onto the typical subspace 
[24] of the source, is not a unitary operation, and thus
we expect that nonunitary operations will be essential to 
showing achievability of the noisy channel capacity. 

We conjecture that once the projection onto the typical 
source subspace is accomplished, nonunitary operations 
are of no further use in achieving reliable transmission 
through a noisy channel. Although we have not yet
rigorously shown this, we give a heuristic argument below.
If the conjecture is true then it can be used to show that
expressions (9.2) and (9.12) are equal.

Our heuristic argument applies only to sources for
which a typical subspace [24] exists. This includes all
i.i.d. sources, for which the output is of the form $\rho_s^{\otimes n}$.
Let $\Lambda$ be the projector onto the typical subspace after $n$
uses of the source, and $\bar{\Lambda}$ the projector onto the
orthogonal subspace. Given any positive $\delta$ it is true that for
sufficiently large $n$,

$\mathrm{tr}(\bar{\Lambda} \rho_s^{\otimes n} \bar{\Lambda}) < \delta$. (10.5)



Defining the restriction of the source to the typical
subspace,

$\rho_\tau \equiv \frac{\Lambda \rho_s^{\otimes n} \Lambda}{\mathrm{tr}(\Lambda \rho_s^{\otimes n} \Lambda)}$, (10.6)



and applying the continuity lemma for entanglement fi- 
delity, (pT4), we see that 



$|F_e(\rho_s^{\otimes n}, \mathcal{E}) - F_e(\rho_\tau, \mathcal{E})| \le \frac{4\delta}{(1 - \delta)^2}$, (10.7)

for any trace-preserving operation $\mathcal{E}$. By choosing $n$
sufficiently large $\delta$ can be made arbitrarily small, and thus
we see that for the entanglement fidelity for the source
to be high asymptotically, it is necessary and sufficient
that the entanglement fidelity be high asymptotically for
the restriction of the source to the typical subspace.
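
The statement (10.5) can be illustrated numerically for an i.i.d. qubit source. The sketch below assumes the usual definition of the typical subspace, namely the span of the eigenvectors of $\rho_s^{\otimes n}$ whose eigenvalue $\lambda$ satisfies $|-\frac{1}{n}\log_2 \lambda - S(\rho_s)| \le \epsilon$. This definition, the function name `atypical_weight`, and the parameter values are assumptions made for illustration and are not taken from the text. For a qubit source with eigenvalues $p$ and $1-p$, the weight outside the typical subspace is then a sum of binomial probabilities.

```python
import math

def atypical_weight(p, n, eps):
    """Weight of rho_s^{(x) n} outside the eps-typical subspace, for a qubit
    source whose density operator has eigenvalues p and 1 - p."""
    q = 1.0 - p
    source_entropy = -(p * math.log2(p) + q * math.log2(q))
    total = 0.0
    for k in range(n + 1):
        # Eigenvalue of rho_s^{(x) n} with k factors of q and n - k factors of p.
        log2_eig = k * math.log2(q) + (n - k) * math.log2(p)
        if abs(-log2_eig / n - source_entropy) > eps:
            # log of the binomial coefficient C(n, k), computed via lgamma to avoid overflow
            log_mult = (math.lgamma(n + 1) - math.lgamma(k + 1)
                        - math.lgamma(n - k + 1))
            total += math.exp(log_mult + log2_eig * math.log(2.0))
    return total

for n in (50, 400, 2000):
    print(n, atypical_weight(0.9, n, 0.1))
```

With these illustrative parameters the printed weight shrinks as $n$ grows, which is the behaviour that (10.5) asserts for sufficiently large $n$.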

We now come to the heuristic argument. In order that
the entanglement fidelity for the total channel be high, it
is asymptotically necessary and sufficient that the
composite operation $\mathcal{D}^n \circ \mathcal{N}^{\otimes n} \circ \mathcal{C}^n$ have high entanglement
fidelity when the source is restricted to the typical
subspace, $\tau$. Hence, if an encoding $\mathcal{C}^n$ is nonunitary on $\tau$, it
must be "close to reversible" on $\tau$, and $\mathcal{D}^n \circ \mathcal{N}^{\otimes n}$ must be
close to reversing it. In [11] it is shown that perfect
reversibility of an operation on a subspace $M$ is equivalent
to the statement that the operation, restricted to that
subspace, may be represented by operators $\{\sqrt{p_i} U_i P_M\}$,
where $P_M U_i^\dagger U_j P_M = \delta_{ij} P_M$ and $P_M$ is the projector onto
$M$. That is, the operation randomly (with probabilities
$p_i$) chooses a unitary which moves the state into one of
a mutually orthogonal set of subspaces. Hence $\mathcal{C}^n$, in its
action on the source's typical subspace, is close to some
perfectly reversible operation $\tilde{\mathcal{C}}^n$ consisting of "randomly
picking a unitary into an orthogonal subspace." Hence
the entanglement fidelity of the total operation $\mathcal{T}$ is close
to that of $\tilde{\mathcal{T}}$, in which the encoding $\mathcal{C}^n$ is replaced with
$\tilde{\mathcal{C}}^n$. The linearity of the entanglement fidelity in the
operation implies that for at least one of the unitaries $U_i$ in
the random-unitaries representation of the perfectly
reversible operation $\tilde{\mathcal{C}}^n$, the entanglement fidelity is at least
as good if the unitary is substituted for $\tilde{\mathcal{C}}^n$. Therefore,
arbitrary encodings $\mathcal{C}^n$ are close to unitary encodings of
$\tau$ into a subspace of the channel's Hilbert space. Thus
the only nonunitarity which it is necessary to consider is
the restriction to the source's typical subspace.



XI. CHANNELS WITH A CLASSICAL 
OBSERVER 

In this section we consider a generalized version of the 
quantum noisy channel coding problem. Suppose that 
in addition to a noisy interaction with the environment 
there is also a classical observer who is able to perform a 
measurement. This measurement may be on the channel 
or the environment of the channel, or possibly on both. 

The result of the measurement is then sent to the de- 
coder, who may use the result to assist in decoding. We 
assume that this transmission of classical information is 
done noiselessly, although it is also interesting to con- 
sider what happens when the classical transmission also 
involves noise. It can be shown [11] that the state
received by the decoder is again related to the state $\rho$ used
as input to the channel by a quantum operation $\mathcal{N}_m$,
where $m$ is the measurement result recorded by the
classical observer,

$\rho \to \frac{\mathcal{N}_m(\rho)}{\mathrm{tr}(\mathcal{N}_m(\rho))}$. (11.1)



The basic situation is illustrated in figure 8. The idea
is that by giving the decoder access to classical
information about the environment responsible for noise in the
channel it may be possible to improve the capacity of
that channel, by allowing the decoder to choose different
decodings $\mathcal{D}_m$ depending on the measurement result $m$.

FIG. 8.
Noisy quantum channel with a classical observer.

We now give a simple example which illustrates that
this can be the case. Suppose we have a two-level
system in a state $\rho$ and an initially uncorrelated four-level
environment initially in the maximally mixed state $I/4$,
so the total state of the joint system is

$\rho \otimes \frac{I}{4}$. (11.2)

We fix an orthonormal basis $|1\rangle, |2\rangle, |3\rangle, |4\rangle$ for the
environment. We assume that a unitary interaction between
the system and environment takes place, given by the
unitary operator

$U = I \otimes |1\rangle\langle 1| + \sigma_x \otimes |2\rangle\langle 2| + \sigma_y \otimes |3\rangle\langle 3| + \sigma_z \otimes |4\rangle\langle 4|$. (11.3)

The output of the channel is thus

$\rho \to \mathcal{N}(\rho) = \mathrm{tr}_E\left[ U \left( \rho \otimes \frac{I}{4} \right) U^\dagger \right]$. (11.4)



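The quantum operation $\mathcal{N}$ can be given two particularly
useful forms,

$\mathcal{N}(\rho) = \frac{1}{4} \left( I \rho I + \sigma_x \rho \sigma_x + \sigma_y \rho \sigma_y + \sigma_z \rho \sigma_z \right)$ (11.5)
$= \frac{I}{2}$. (11.6)

It is not difficult to show from the second form that

$C(\mathcal{N}) = 0$, (11.7)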



and thus the channel capacity for the channel $\mathcal{N}$ is equal
to zero. Suppose now that an observer is introduced, who
is allowed to perform a measurement on the environment.
This measurement is a Von Neumann measurement in the
$|1\rangle, |2\rangle, |3\rangle, |4\rangle$ basis, and yields a corresponding
measurement result, $m = 1, 2, 3, 4$. Then the quantum operations
corresponding to these four measurement outcomes are

$\mathcal{N}_1(\rho) = \frac{1}{4} \rho$, (11.8)
$\mathcal{N}_2(\rho) = \frac{1}{4} \sigma_x \rho \sigma_x$, (11.9)
$\mathcal{N}_3(\rho) = \frac{1}{4} \sigma_y \rho \sigma_y$, (11.10)
$\mathcal{N}_4(\rho) = \frac{1}{4} \sigma_z \rho \sigma_z$. (11.11)
(11.11) 
Each of these is unitary, up to a constant multiplying
factor, so conditioned on knowing the measurement
result $m$, the corresponding channel capacity $C_m$ is perfect.
That is,

$C_m = 1$ (11.12)

for all measurement outcomes $m$. This is an example
where the capacity of the observed channel is strictly
greater than for the unobserved channel.
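
The example can be checked numerically for a single use of the channel. The sketch below assumes the operation elements $\sigma_m / 2$ read off from (11.5) and (11.8)-(11.11), and computes the coherent information from normalized output and exchange matrices, so that it also applies to the trace-decreasing branches $\mathcal{N}_m$. The function names, the test state `sigma`, and the choice of the maximally mixed input are illustrative assumptions.

```python
import numpy as np

def entropy(mat):
    """Von Neumann entropy, in bits, of a Hermitian matrix with unit trace."""
    evals = np.linalg.eigvalsh(mat)
    evals = evals[evals > 1e-12]  # drop zero eigenvalues (0 log 0 = 0)
    return float(-np.sum(evals * np.log2(evals)))

def coherent_information(rho, kraus):
    """Coherent information of a (possibly trace-decreasing) operation,
    using the normalized output state and normalized exchange matrix."""
    out = sum(A @ rho @ A.conj().T for A in kraus)
    prob = np.trace(out).real
    W = np.array([[np.trace(A @ rho @ B.conj().T) for B in kraus] for A in kraus])
    return entropy(out / prob) - entropy(W / prob)

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
kraus = [P / 2 for P in (I2, X, Y, Z)]  # operation elements of (11.5)

# Second form (11.6): any input is mapped to I/2. `sigma` is a hypothetical test state.
sigma = np.array([[0.8, 0.3], [0.3, 0.2]], dtype=complex)
output = sum(A @ sigma @ A.conj().T for A in kraus)

rho = I2 / 2  # maximally mixed input
unobserved = coherent_information(rho, kraus)
probs = [np.trace(A @ rho @ A.conj().T).real for A in kraus]
observed = sum(p * coherent_information(rho, [A]) for p, A in zip(probs, kraus))

print(np.allclose(output, I2 / 2))  # True
print(unobserved, observed)         # -1.0 and 1.0 for this input
```

For the maximally mixed input each outcome occurs with probability 1/4 and carries one bit of coherent information, while the unobserved channel gives a negative value for the same input. The bound $C(\mathcal{N}) = 0$ in (11.7) arises from the maximization over inputs, which this sketch does not perform.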

This result is particularly clear in the context of
teleportation. Nielsen and Caves [11] showed that the
problem of teleportation can be understood as the problem
of a quantum noisy channel with an auxiliary classical
channel. In the single qubit teleportation scheme of
Bennett et al. there are four quantum operations relating
the state Alice wishes to teleport to the state Bob
receives, corresponding to each of the four measurement
results. In that scheme it happens that those four
operations are the $\mathcal{N}_m$ we have described above. Furthermore,
in the absence of the classical channel, that is, when Alice
does not send the result of her measurement to Bob, the
channel is described by the single operation $\mathcal{N}$. Clearly,
in order that causality be preserved we expect that the
channel capacity be zero. On the other hand, in order
that teleportation be able to occur we expect that the
channel capacity $C_m = 1$, as was shown above.
Teleportation understood in this way as a noisy channel with
a classical side channel offers a particularly elegant way
of seeing that the transmission of quantum information
may sometimes be greatly improved by making use of
classical information.

The remainder of this section is organized into three
subsections. Subsection XI A proves bounds on the
capacity of an observed channel. This requires nontrivial
extensions of the techniques developed earlier for proving
bounds on the capacity of an unobserved channel.
Subsection XI B relates work done on the observed channel
to the work done on the unobserved channel.
Subsection XI C discusses possible extensions to this work on
observed channels.



A. Upper Bounds on Channel Capacity 

We now prove several results bounding the channel ca- 
pacity of an observed channel, just as we did earlier for 
the unobserved channel. The following lemma general- 
izes the earlier entanglement fidelity lemma for quantum 
operations, which was the foundation of our earlier proofs 
of upper bounds on the channel capacity. 

Lemma (generalized entanglement fidelity lemma for
operations)

Suppose $\mathcal{E}_m$ are a set of quantum operations such that
$\sum_m \mathcal{E}_m$ is a trace-preserving quantum operation.
Suppose further that $\mathcal{D}_m$ is a trace-preserving quantum
operation for each $m$. Then

$S(\rho) \le \sum_m \mathrm{tr}(\mathcal{E}_m(\rho)) I(\rho, \mathcal{E}_m) + 2 + 4(1 - F_e(\rho, \mathcal{T})) \log_2 d$, (11.13)

where

$\mathcal{T} \equiv \sum_m \mathcal{D}_m \circ \mathcal{E}_m$. (11.14)

771 

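By the second step of the data processing inequality,
(7.9), $I(\rho, \mathcal{E}_m) \ge I(\rho, \mathcal{D}_m \circ \mathcal{E}_m)$ for each $m$, and
noting also that by the trace-preserving property of $\mathcal{D}_m$,
$\mathrm{tr}(\mathcal{E}_m(\rho)) = \mathrm{tr}((\mathcal{D}_m \circ \mathcal{E}_m)(\rho))$, we obtain

$S(\rho) \le S(\rho) + \sum_m \left[ \mathrm{tr}(\mathcal{E}_m(\rho)) I(\rho, \mathcal{E}_m) - \mathrm{tr}((\mathcal{D}_m \circ \mathcal{E}_m)(\rho)) I(\rho, \mathcal{D}_m \circ \mathcal{E}_m) \right]$. (11.15)

Applying the generalized convexity theorem for coherent
information (7.24) gives

$-\sum_m \mathrm{tr}((\mathcal{D}_m \circ \mathcal{E}_m)(\rho)) I(\rho, \mathcal{D}_m \circ \mathcal{E}_m) \le -I(\rho, \mathcal{T})$. (11.16)

We obtain

$S(\rho) \le \sum_m \mathrm{tr}(\mathcal{E}_m(\rho)) I(\rho, \mathcal{E}_m) + S(\rho) - I(\rho, \mathcal{T})$. (11.17)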

But $\mathcal{T} = \sum_m \mathcal{D}_m \circ \mathcal{E}_m$ is trace-preserving since $\mathcal{D}_m$
is trace-preserving and $\sum_m \mathcal{E}_m$ is trace-preserving, and
thus by d^H), 

$S(\rho) - I(\rho, \mathcal{T}) = S(\rho) - S(\mathcal{T}(\rho)) + S_e(\rho, \mathcal{T})$ (11.18)
$\le 2 S_e(\rho, \mathcal{T})$. (11.19)

Finally, an application of the quantum Fano inequality
(6.9) along with the observations that the entropy
function $h$ appearing in that inequality is bounded above by
one, and $\log(d^2 - 1) \le 2 \log d$, gives

$S(\rho) \le \sum_m \mathrm{tr}(\mathcal{E}_m(\rho)) I(\rho, \mathcal{E}_m) + 2 + 4(1 - F_e(\rho, \mathcal{T})) \log d$, (11.20)

as we set out to prove.

If we define

$C(\{\mathcal{N}_m\}) \equiv \lim_{n \to \infty} \max_{\rho, \mathcal{C}^n} \frac{1}{n} \sum_{m_1, \ldots, m_n} \mathrm{tr}\left( ((\mathcal{N}_{m_1} \otimes \cdots \otimes \mathcal{N}_{m_n}) \circ \mathcal{C}^n)(\rho) \right) I(\rho, (\mathcal{N}_{m_1} \otimes \cdots \otimes \mathcal{N}_{m_n}) \circ \mathcal{C}^n)$, (11.21)

we may use (11.13) to easily prove that $C(\{\mathcal{N}_m\})$ is an
upper bound on the rate of reliable transmission through
an observed channel, in precisely the same way we earlier
used (7.34) to prove bounds for unobserved channels.
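
The final step in the proof of the lemma above can be written out explicitly. The block below assumes that the quantum Fano inequality (6.9) has its standard form, with $h$ the binary entropy function and logarithms taken to base 2; it is a worked restatement of the argument rather than an additional result.

```latex
\begin{align}
S_e(\rho,\mathcal{T})
  &\le h\big(F_e(\rho,\mathcal{T})\big)
     + \big(1 - F_e(\rho,\mathcal{T})\big)\log_2(d^2 - 1) \\
  &\le 1 + 2\big(1 - F_e(\rho,\mathcal{T})\big)\log_2 d, \\
2\,S_e(\rho,\mathcal{T})
  &\le 2 + 4\big(1 - F_e(\rho,\mathcal{T})\big)\log_2 d .
\end{align}
```

Substituting the last line into (11.17), with $S(\rho) - I(\rho, \mathcal{T})$ bounded by $2 S_e(\rho, \mathcal{T})$ as in (11.19), reproduces the right hand side of (11.13).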

We may derive the same bound in another fashion if
we associate observed channels with trace-preserving
unobserved channels in the following fashion suggested by
examples in [||. To an observed channel $\{\mathcal{N}_m\}$ we
associate a single trace-preserving operation $\mathcal{M}$ from $H_C$ to
a larger Hilbert space $H_C \otimes M$, where $M$ is a "register"
Hilbert space. Each dimension of $M$ corresponds to a
different measurement result, $m$. The operation is specified
by:

$\mathcal{M}(\rho) \equiv \sum_m \mathcal{N}_m(\rho) \otimes |m\rangle\langle m|$, (11.22)

where $|m\rangle$ is some set of orthogonal states corresponding
to the measurement results which may occur. This map
is an "all-quantum" version of the observed channel.

Since our upper bounds to the capacity of an
unobserved channel apply also to channels with output Hilbert
spaces of different dimensionality than the input space,
they apply to this map as well. It is easily verified that
the coherent information for the map $\mathcal{M}$ acting on $\rho$ is
the same as the average coherent information for the
observed channel $\{\mathcal{N}_m\}$ acting on $\rho$, which appears in (11.12)
and in the bound (11.21). To show this, define



$p_m \equiv \mathrm{tr}(\mathcal{N}_m(\rho^Q))$, (11.23)

where we are again working in the RQ picture of
operations. Then $\rho^{Q'} = \mathcal{M}(\rho^Q)$ is given by (11.22), so that

$S(\rho^{Q'}) = H(p_m) + \sum_m p_m S\left( \frac{\mathcal{N}_m(\rho^Q)}{p_m} \right)$, (11.24)

since the density matrices $\mathcal{N}_m(\rho^Q) \otimes |m\rangle\langle m|$ are mutually
orthogonal. Also,

$\rho^{R'Q'} = \sum_m (\mathcal{I}_R \otimes \mathcal{M}_m)(\rho^{RQ})$, (11.25)

where by definition $\mathcal{M}_m(\rho) \equiv \mathcal{N}_m(\rho) \otimes |m\rangle\langle m|$. Thus

$S(\rho^{R'Q'}) = H(p_m) + \sum_m p_m S\left( \frac{(\mathcal{I}_R \otimes \mathcal{N}_m)(\rho^{RQ})}{p_m} \right)$. (11.26)

Hence the coherent information is

$I(\rho^Q, \mathcal{M}) = \sum_m p_m \left[ S\left( \frac{\mathcal{N}_m(\rho^Q)}{p_m} \right) - S\left( \frac{(\mathcal{I}_R \otimes \mathcal{N}_m)(\rho^{RQ})}{p_m} \right) \right]$, (11.27)

which can be rewritten as the average coherent
information for $\{\mathcal{N}_m\}$,

$I(\rho^Q, \mathcal{M}) = \sum_m p_m I(\rho^Q, \mathcal{N}_m)$. (11.28)

So an application of the bound (9.12) on the rate of
transmission through the unobserved channel $\mathcal{M}$ shows the
expression on the right hand side of (11.21) which bounds
the capacity of the observed channel $\{\mathcal{N}_m\}$ also bounds
the capacity of $\mathcal{M}$. This result provides some evidence for
the intuitively reasonable proposition that $\mathcal{M}$ and $\{\mathcal{N}_m\}$
are equivalent with respect to transmission of quantum
information.

Bennett et al. derive capacities for three simple
channels which may be viewed as taking the form (11.22).
The quantum erasure channel takes the input state to a
fixed state orthogonal to the input state with
probability $\epsilon$; otherwise, it transmits the state undisturbed. An
equivalent observed channel would with probability $\epsilon$
replace the input state with a standard pure state $|0\rangle\langle 0|$
within the input subspace, and also provide classical
information as to whether this replacement has occurred or
not. The phase erasure channel randomizes the phase of
a qubit with probability $\delta$, and otherwise transmits the
state undisturbed; it also supplies classical information
as to which of these alternatives occurred. The mixed
erasure/phase-erasure channel may either erase or
phase-erase, with exclusive probabilities $\epsilon$ and $\delta$. Bennett et
al. note that the capacity $\max(0, 1 - 2\epsilon)$ of the erasure
channel is in fact the one-shot maximal coherent
information. We have verified that the capacities they
derive for the phase-erasure channel, $1 - \delta$, and the mixed
erasure/phase-erasure channel, $\max(0, 1 - 2\epsilon - \delta)$, are the
same as the one-shot maximal average coherent
information for the corresponding observed channels, lending
some additional support to the view that the bounds we
have derived here are in fact the capacities.



B. Relationship to Unobserved Channel

Suppose a quantum system passes through a
channel, interacts with an environment, and then
measurements are performed on the environment alone. How is
the capacity of this observed channel related to the ca- 
pacity of the channel which results if no measurement 
had been performed on the environment? Physically, it 
is clear that the capacity when measurements are per- 
formed must be at least as great as when no measure- 
ments on the environment are performed, since the de- 
coder can always ignore the result of the measurement. 
In this subsection we show that bounds we have derived 
on channel capacity have this same property: observa- 
tion of the environment can never decrease the bounds 
we have obtained. 

Suppose $\{\mathcal{N}_m\}$ are the operations describing the
different possible measurement outcomes. Then the operation
describing the same channel, but without any observation
of the environment, is

$\mathcal{N} = \sum_m \mathcal{N}_m$. (11.29)



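Recall the expressions for the bound on the capacity
of the unobserved channel,

$C(\mathcal{N}) = \lim_{n \to \infty} \max_{\rho, \mathcal{C}^n} \frac{I(\rho, \mathcal{N}^{\otimes n} \circ \mathcal{C}^n)}{n}$, (11.30)

and the observed channel,

$C(\{\mathcal{N}_m\}) = \lim_{n \to \infty} \max_{\rho, \mathcal{C}^n} \frac{1}{n} \sum_{m_1, \ldots, m_n} \mathrm{tr}\left( ((\mathcal{N}_{m_1} \otimes \cdots \otimes \mathcal{N}_{m_n}) \circ \mathcal{C}^n)(\rho) \right) I(\rho, (\mathcal{N}_{m_1} \otimes \cdots \otimes \mathcal{N}_{m_n}) \circ \mathcal{C}^n)$. (11.31)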



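But the generalized convexity theorem (7.24) for
coherent information implies that

$I(\rho, \mathcal{N}^{\otimes n} \circ \mathcal{C}^n) \le \sum_{m_1, \ldots, m_n} \mathrm{tr}\left( ((\mathcal{N}_{m_1} \otimes \cdots \otimes \mathcal{N}_{m_n}) \circ \mathcal{C}^n)(\rho) \right) I(\rho, (\mathcal{N}_{m_1} \otimes \cdots \otimes \mathcal{N}_{m_n}) \circ \mathcal{C}^n)$, (11.32)

and thus

$C(\mathcal{N}) \le C(\{\mathcal{N}_m\})$. (11.33)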



To see that this inequality may sometimes be strict,
return to the example considered earlier in this section
in the context of teleportation. In that case it is not
difficult to verify that

$0 = C(\mathcal{N}) < C(\{\mathcal{N}_m\}) = 1$. (11.34)



What these results show is that our bounds on the 
channel capacity are never made any worse by observ- 
ing the environment, but sometimes they can be made 
considerably better. This is a property that we certainly 
expect the quantum channel capacity to have, and we 
take as an encouraging sign that the bounds we have 
proved in this paper are in fact achievable, that is, the 
true capacities. 



C. Discussion 

All the questions asked about the bounds on channel 
capacity for an unobserved channel can be asked again 
for the observed channel: questions about achievability 
of bounds, the differences in power achievable by differ- 
ent classes of encodings and decodings, and so on. We 
do not address these problems here, beyond noting that 
they are important problems which need to be addressed 
by future research. 



Many new twists on the problem of the quantum noisy 
channel arise when an observer of the environment is al- 
lowed. For example, one might consider the situation 
where the classical channel connecting the observer to 
the decoder is noisy. What then are the resources re- 
quired to transmit coherent quantum information? 

It may also be interesting to prove results relating the 
classical and quantum resources that are required to per- 
form a certain task. For example, in teleportation it can 
be shown that one requires not only the quantum chan- 
nel, but also two bits of classical information, in order to 
transmit quantum information with perfect reliability. 



XII. CONCLUSION 

In this paper we have shown that different informa- 
tion transmission problems may result in different chan- 
nel capacities for the same noisy quantum channel. We 
have developed some general techniques for proving up- 
per bounds on the amount of information that may be 
transmitted reliably through a noisy quantum channel. 

Perhaps the most interesting thing about the quantum 
noisy channel problem is to discover what is new and es- 
sentially quantum about the problem. The following list 
summarizes what we believe are the essentially new fea- 
tures: 

1. The insight that there are many essentially differ- 
ent information transmission problems in quantum 
mechanics, all of them of interest depending on the 
application. These span a spectrum between two 
extremes: 

• The transmission of a discrete set of mutu- 
ally orthogonal quantum states through the 
channel. Such problems are problems of trans- 
mitting classical information through a noisy 
quantum channel. 

• The transmission of entire subspaces of quan- 
tum states through the channel, which neces- 
sarily keeps all other quantum resources, in- 
cluding entanglement, intact. This is likely 
to be of interest in applications such as quan- 
tum computation, cryptography and telepor- 
tation where superpositions of quantum states 
are crucial. Such problems are problems of 
transmitting coherent quantum information 
through a noisy quantum channel. 

Both these cases and a variety of intermediate cases 
are important for specific applications. For each 
case, there is great interest in considering different 
classes of allowed encodings and decodings. For 
example, it may be that encoding and decoding 
can only be done using local operations and one-way
classical communication. This may give rise
to a different channel capacity than occurs if we al- 
low non-local encoding and decoding. Thus there 
are different noisy channel problems depending on 
what classes of encodings and decodings are al- 
lowed. 

2. The use of quantum entanglement to construct examples
where the quantum analogue of the classical
pipelining inequality $H(X : Z) \leq H(Y : Z)$ for a
Markov process $X \to Y \to Z$ fails to hold (cf.
eqtn. (7.43)).



3. The use of quantum entanglement to construct ex- 
amples where the subadditivity property of mutual 
information, 

$$H(X_1, \ldots, X_n : Y_1, \ldots, Y_n) \leq \sum_i H(X_i : Y_i), \tag{12.1}$$



fails to hold (cf. eqtn. (7.5^ 
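For comparison, the classical pipelining inequality that the quantum analogue violates is easy to verify numerically. The following sketch assumes a hypothetical Markov chain $X \to Y \to Z$ built from two binary symmetric channels with flip probabilities 0.1 and 0.2. These parameters and the helper names are illustrative assumptions only.

```python
import numpy as np

def mutual_information(joint):
    """Mutual information in bits of a joint distribution given as a 2-D array."""
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log2(joint[mask] / (px @ py)[mask])))

def bsc(flip_prob):
    """Transition matrix T[a, b] = P(output = b | input = a)."""
    return np.array([[1 - flip_prob, flip_prob],
                     [flip_prob, 1 - flip_prob]])

p_x = np.array([0.5, 0.5])
T_xy = bsc(0.1)  # channel X -> Y (assumed parameters)
T_yz = bsc(0.2)  # channel Y -> Z (assumed parameters)

joint_xy = p_x[:, None] * T_xy   # P(x, y)
joint_xz = joint_xy @ T_yz       # P(x, z) = sum_y P(x, y) P(z | y)
p_y = joint_xy.sum(axis=0)
joint_yz = p_y[:, None] * T_yz   # P(y, z)

h_xz = mutual_information(joint_xz)
h_yz = mutual_information(joint_yz)
print(h_xz, h_yz)
```

In this classical example $H(X : Z)$ does not exceed $H(Y : Z)$, as the pipelining inequality requires.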



There are many more interesting open problems asso- 
ciated with the noisy channel problem than have been 
addressed here. The following is a sample of those prob- 
lems which we believe to be particularly important: 

1. The development of an effective procedure for de- 
termining channel capacities. We believe that this 
is the most important problem remaining to be ad- 
dressed. Assuming our upper bound 



$$C(\mathcal{N}) = \lim_{n \to \infty} \max_{\rho, \mathcal{C}} \frac{I(\rho, \mathcal{N}^{\otimes n} \circ \mathcal{C})}{n} \tag{12.2}$$



is, in fact, the channel capacity for general encod- 
ings, it still remains to find an effective procedure 
for evaluating this quantity. Both maximizations 
can be done relatively easily, since they are of a 
continuous function over a compact set. However, 
we do not yet understand the convergence of the 
limit well enough to have an effective procedure for 
evaluating this quantity. 

2. Once an effective procedure has been obtained for 
evaluating channel capacities, it still remains to de- 
velop good numerical algorithms for performing the 
evaluation. Assuming that the evaluation involves 
some kind of maximization of the coherent informa- 
tion, it becomes important to know whether the coherent
information is concave in $\rho$. Combined with
the convexity of the coherent information in the 
operation this would give a powerful tool for the 
development of numerical algorithms for the deter- 
mination of channel capacity. 



3. Estimation of channel capacities for realistic chan- 
nels. This work could certainly be done theoret- 
ically and perhaps also experimentally. Recent 
work on quantum process tomography [27, 28] points
the way toward experimental determination of the 
quantum channel capacity. A related problem is 
to analyze how stable the determination of channel 
capacities is with respect to experimental error. 



4. As suggested in subsection IX C, it would be interesting
to see what channel capacities are attainable
for different classes of allowable encodings and / 
or decodings, for example, encodings where the en- 
coder is only allowed to do local operations and one- 
way classical communication, or encodings where 
the encoder is allowed to do local operations and 
two-way classical communication. We have shown
how to prove bounds on the channel capacity in 
these cases; whether these bounds are attainable is 
unknown. 

5. The development of rigorous general techniques for 
proving attainability of channel capacities, which 
may be applied to different classes of allowed en- 
codings and decodings. 

6. Finding the capacity of a noisy quantum channel 
for classical information. A related problem arises 
in the context of superdense coding, where one half 
of an EPR pair can be used to send two bits of clas- 
sical information. It would be interesting to know 
to what extent this performance is degraded if the 
pair of qubits shared between sender and receiver is
not an EPR pair, but rather the sharing is done 
using a noisy quantum channel, leading to a de- 
crease in the number of classical bits that can be 
sent. Given a noisy quantum channel, what is the 
maximum amount of classical information that can 
be sent in this way? 

7. All work done thus far has been for discrete chan- 
nels, that is, channels with finite dimensional state 
spaces. It is an important and non-trivial problem 
to extend these results to channels with infinite di- 
mensional state spaces. 

8. A more thorough study of noisy channels which 
have a classical side channel. Can the classical in- 
formation obtained by an observer be related to 
changes in the channel capacity? What if the clas- 
sical side channel is noisy? Many other fascinat- 
ing problems, too many to enumerate here, suggest 
themselves in this context. 

There are many other ways the classical results on 
noisy channels have been extended - considering channels
with feedback, developing rate-distortion theory,
understanding networks consisting of more than one channel,
and so on. Each of these could give rise to highly
interesting work on noisy quantum channels. It is also
to be expected that interesting new questions will arise 
as experimental efforts in the field of quantum informa- 
tion develop further. Perhaps of chief interest to us is 
to develop a still clearer understanding of the essential 
differences between the quantum noisy channel and the 
classical noisy channel problem. 
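APPENDIX A: EXISTENCE OF LIMITS

This appendix contains a lemma that can be used to
prove the existence of several limits that appear in this
paper.

Lemma: Suppose $c_1, c_2, \ldots$ is a nonnegative sequence
such that $c_n \leq kn$ for some $k > 0$, and

$$c_m + c_n \leq c_{m+n} \tag{A1}$$

for all $m$ and $n$. Then

$$\lim_{n \to \infty} \frac{c_n}{n} \tag{A2}$$

exists and is finite.

Proof

Define

$$c = \limsup_{n} \frac{c_n}{n}. \tag{A3}$$

This always exists and is finite, since $c_n \leq kn$ for some
$k > 0$. Fix $\epsilon > 0$ and choose $n$ sufficiently large that

$$\frac{c_n}{n} > c - \epsilon. \tag{A4}$$

Suppose $m$ is any integer strictly greater than
$\max(n, n/\epsilon)$. Then by (A1) and the nonnegativity of the
sequence,

$$\frac{c_m}{m} \geq \frac{c_{\lfloor m/n \rfloor n}}{n} \, \frac{n}{m}. \tag{A5}$$

Using the fact that $l c_n \leq c_{ln}$ (an immediate consequence
of (A1)) with $l = \lfloor m/n \rfloor$ gives

$$\frac{c_{\lfloor m/n \rfloor n}}{n} \geq \left\lfloor \frac{m}{n} \right\rfloor \frac{c_n}{n} \tag{A6}$$

$$\geq \left( \frac{m}{n} - 1 \right) \frac{c_n}{n}, \tag{A7}$$

where $\lfloor x \rfloor$ is the greatest integer not exceeding $x$. Plugging
the last inequality into (A5) gives

$$\frac{c_m}{m} \geq \frac{c_n}{n} \left( 1 - \frac{n}{m} \right). \tag{A8}$$

But $-n/m > -\epsilon$ and $c_n/n > c - \epsilon$, so

$$\frac{c_m}{m} > (c - \epsilon)(1 - \epsilon). \tag{A9}$$

This inequality holds for all sufficiently large $m$, and thus

$$\liminf_{n} \frac{c_n}{n} \geq (c - \epsilon)(1 - \epsilon). \tag{A10}$$

But $\epsilon$ was an arbitrary number greater than 0, so letting
$\epsilon \to 0$ we see that

$$\liminf_{n} \frac{c_n}{n} \geq c = \limsup_{n} \frac{c_n}{n}. \tag{A11}$$

It follows that $\lim_n c_n/n$ exists, as claimed.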
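The behaviour described by the lemma of Appendix A can be seen numerically on a simple example. The sketch below uses the hypothetical sequence $c_n = n - \sqrt{n}$, chosen only for illustration. It is nonnegative, satisfies $c_n \leq n$, and is superadditive because $\sqrt{m+n} \leq \sqrt{m} + \sqrt{n}$. The code checks superadditivity on a finite range and prints $c_n/n$ for increasing $n$.

```python
import math

def c(n):
    """Example sequence c_n = n - sqrt(n) (an illustrative assumption)."""
    return n - math.sqrt(n)

# Check c_m + c_n <= c_{m+n} on a finite range; the small tolerance
# only absorbs floating-point rounding.
superadditive = all(c(m) + c(n) <= c(m + n) + 1e-12
                    for m in range(1, 200) for n in range(1, 200))

# c_n / n = 1 - 1/sqrt(n) increases towards its limit, 1.
ratios = [c(n) / n for n in (10, 1_000, 100_000, 10_000_000)]

print(superadditive)
print(ratios)
```

Here the ratios approach 1, which is the value of the limit for this particular sequence.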



APPENDIX B: MAXIMA OF THE COHERENT 
INFORMATION 

Various convexity and concavity properties are useful 
in calculating classical channel capacities, and the same is 
true in the quantum situation. This appendix is devoted 
to an explication of the basic properties of convexity and 
concavity related to the coherent information and the relation
of these properties to expressions such as (9.12).

A convex set, $S$, is a subset of a vector space such
that given any two points $s_1, s_2 \in S$ and any $\lambda$ such that
$0 \leq \lambda \leq 1$, the convex combination $\lambda s_1 + (1 - \lambda) s_2$
is also an element of $S$. Geometrically, this means that
given any two points in the set, the line segment joining
them is also in the set. An extremal point of $S$ is a point $s$ which
cannot be formed from the convex combination of any
other two points in the set. A convex function $f$ on $S$
is a real-valued function such that for any $\lambda$ satisfying
$0 \leq \lambda \leq 1$,

$$f(\lambda s_1 + (1 - \lambda) s_2) \leq \lambda f(s_1) + (1 - \lambda) f(s_2); \tag{B1}$$



a concave function satisfies the same condition but with 
the inequality reversed. 

The first useful fact about maxima is the following:
Local maximum is a global maximum: Suppose $f$ is a
concave function on a convex set $S$. Then a local maximum
of $f$ is also a global maximum of $f$.

This follows by supposing that $s_1$ and $s_2$ are distinct
local maxima. If $f(s_1) < f(s_2)$, say, then for $0 \leq \lambda < 1$,

$$f(\lambda s_1 + (1 - \lambda) s_2) \geq \lambda f(s_1) + (1 - \lambda) f(s_2) \tag{B2}$$

$$> f(s_1), \tag{B3}$$

by concavity of $f$. By choosing values of $\lambda$ sufficiently
close to 1, so that the point $\lambda s_1 + (1 - \lambda) s_2$ is
arbitrarily close to $s_1$, we see that this violates the fact that $s_1$ is a
local maximum. Thus $f$ has the same value for all local
maxima, from which it follows that all local maxima are
also global maxima for the function. If the coherent
information turns out to be concave in the input density
operator, this property will be useful in evaluating
capacity bounds such as (9.12).

The following lemma, from [29], is extremely useful
in computing the maxima of convex functions on convex 
sets. 

Convexity Lemma: Suppose $f$ is a continuous convex
function on a compact, convex set, $S$. Then there is an
extremal point at which $f$ attains its global maximum.

The proof is obvious. The reason for our interest in
the lemma is that, for fixed $\rho$ and trace-preserving
operations $\mathcal{E}$, the coherent information $I(\rho, \mathcal{E})$ is a convex,
continuous function of the operation $\mathcal{E}$. The set of
trace-preserving quantum operations forms a compact, convex
set, and thus by the convexity lemma, $I(\rho, \mathcal{E})$ attains its
maximum for a quantum operation $\mathcal{E}$ which is extremal
in the set of all trace-preserving quantum operations.
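The convexity lemma can be illustrated on a much simpler convex set than the set of quantum operations. The sketch below assumes a hypothetical convex function, the squared distance from a fixed point, on the probability simplex in three dimensions, whose extremal points are the three unit vectors. Neither the function nor the set comes from the paper. Random sampling of the simplex never produces a value above the largest vertex value.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    """A convex function on R^3: squared distance from a fixed point."""
    target = np.array([0.2, 0.5, 0.3])
    return float(np.sum((x - target) ** 2))

# Extremal points of the probability simplex are the unit vectors.
vertices = np.eye(3)
best_vertex_value = max(f(v) for v in vertices)

# Random points of the simplex, drawn from a flat Dirichlet distribution.
samples = rng.dirichlet(np.ones(3), size=10_000)
best_sample_value = max(f(x) for x in samples)

print(best_vertex_value, best_sample_value)
```

A sampling check of this kind is only suggestive, but it shows why restricting a maximization of a convex function to extremal points can shrink the search space considerably.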

Choi [30] has proved that any extremal point in the
set of trace-preserving quantum operations has a set of
operation elements $\{A_i\}$ such that

1. There are at most $d$ elements $A_i$. This is to be
contrasted with the general situation, where there
may be up to $d^2$ elements.

2. The $A_i$ are linearly independent.
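The two listed conditions are straightforward to test for a given set of operation elements. The sketch below checks them for two hypothetical single-qubit examples ($d = 2$) that do not appear in the paper: an amplitude damping channel with two elements, and the channel with elements $\sigma_m / 2$ built from the identity and the Pauli matrices. It relies on the standard fact that the number of linearly independent elements is the smallest number of elements in any operator-sum representation. Passing the check shows only that the conditions as stated here are met, not that the operation is extremal. The helper names are assumptions.

```python
import numpy as np

def kraus_rank(elements):
    """Number of linearly independent operation elements."""
    stacked = np.array([A.reshape(-1) for A in elements])
    return int(np.linalg.matrix_rank(stacked))

def satisfies_listed_conditions(elements, d):
    """True when there are at most d elements and they are linearly independent."""
    return len(elements) <= d and kraus_rank(elements) == len(elements)

gamma = 0.3  # damping probability, chosen arbitrarily for the example
amplitude_damping = [
    np.array([[1, 0], [0, np.sqrt(1 - gamma)]], dtype=complex),
    np.array([[0, np.sqrt(gamma)], [0, 0]], dtype=complex),
]

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
pauli_channel = [P / 2 for P in (I2, X, Y, Z)]

print(satisfies_listed_conditions(amplitude_damping, d=2))
print(satisfies_listed_conditions(pauli_channel, d=2))
print(kraus_rank(pauli_channel))
```

The second channel needs four linearly independent elements, more than $d = 2$, so no representation of it can satisfy the first condition.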

This result provides a considerable saving in the class
of quantum operations that must be optimized over in
order to numerically calculate expressions of the form
(9.12). Unfortunately, this only takes us part of the way
towards proving that the expressions (9.12) and (9.2) are
identically equal, or, alternatively, it suggests a starting
point for a search for counterexamples to the proposition
that the two quantities are equal. If the extremal points
of the set of quantum operations were the unitary operations
we would be done. However, that is not the case,
as the above theorem shows.